EXECUTIVE SUMMARY

Requirement:

The  US  Army  Research  Institute  conducts  research  on  manpower, 
personnel,  and  training  issues  of  interest  to  the  Army.  Recently 
there  has  been  a  need  for  basic  research  into  objective  ways  of 
measuring  and  evaluating  organizational  performance  efficiency 
under  different  policies  and  resource  allocation  procedures. 

Procedure:

Building  upon  their  previous  theoretical  work  in  mathematical 
programming,  the  authors  have  generalized  the  concept  of  Data 
Envelopment  Analysis  to  include  new  theoretical  characterizations 
of  empirical  production  functions. 

Findings: 

The developments show how a Pareto-optimal frontier production
function can be developed, and such problems as economies of scale,
isotonicity and non-concavity, discretionary and non-discretionary
inputs, and Cobb-Douglas multiplicative functional problems can be
solved. Also, simulations are performed which demonstrate that DEA
methodology is superior to other methods (ratio analysis and
regression analysis) not only for identifying inefficiencies but also
for locating their sources and estimating their magnitude in
particular decision making units.

Utilization of Findings:

Methodologies developed here provide new approaches for
measuring the efficiency and productivity of organizations that
have multiple inputs and outputs. This methodology could be
applied to resource allocation and evaluation problems in
recruiting, training, unit performance, equipment maintenance,
personnel management, logistic management, and weapon system
development.


I. INTRODUCTION

II. PARETO OPTIMALITY, EFFICIENCY ANALYSIS, AND EMPIRICAL
    PRODUCTION FUNCTIONS

III. INVARIANT MULTIPLICATIVE EFFICIENCY AND PIECEWISE
     COBB-DOUGLAS ENVELOPMENTS

IV. A COMPARATIVE STUDY OF DATA ENVELOPMENT ANALYSIS
    AND OTHER APPROACHES TO EFFICIENCY EVALUATION
    AND ESTIMATION

APPENDIX


LIST  OF  TABLES 


Table 1. DEA Ratings of Artificial DMUs

      2. Single Output Measures

      3. Comparison of DEA, Ratio Analysis, and Linear Regression
         Approaches' Ability to Locate Inefficient DMUs

      4. H15 Intensity Adjustment and Efficiency Value


LIST  OF  FIGURES 


Figure 1. Empirical Production Possibility Set

       2. Isotonic Function with Concave Cap

       3. Concave Production Frontier


I. INTRODUCTION

Economists and management scientists have long been interested
in production functions, or the relationship of resources to
organizational outputs. Data envelopment analysis (DEA), developed
by Charnes, Cooper, and Rhodes (1978), provides a new methodology
for measuring the technical efficiency of organizations that use
multiple inputs to produce multiple outputs.

Data envelopment analysis has contributed to both basic and
applied research in efficiency analysis. It is basic in the sense
that it provides a new mathematical model for describing the behavior
of organizations in transforming inputs into outputs. It is
applied since it relies upon empirical data with direct
implications for identifying specific inefficiencies and
redirecting management effort. It is ideally suited for the
evaluation of public sector institutions, because it can deal with
multiple outputs and does not require information on prices. DEA
has been applied to education (Bessent, 1983), health care
(Sherman, 1981), Navy recruiting (Lewin and Morey, 1980), criminal
court systems (Lewin and Morey, 1984), and computer software
evaluation (Barr, 1983).

The following sections of this report describe basic research
that has extended and improved the mathematical models available
for analyzing organizational efficiency. Section II presents a new
data envelopment analysis method that is a substantial improvement
over the original approach. This new model permits the analysis of
the rates of change of individual outputs with respect to changes in
specific inputs. Further, the new model improves the computational
algorithm by searching only the optimal points in the solution space.

Section III provides a multiplicative efficiency model.
Previous formulations had been sensitive to the units of
measurement. Here, a simple change is formulated that preserves
the desirability of the multiplicative format and creates invariant
measures of efficiency.

The last section compares DEA, ratio, and regression analysis
through investigation of an artificial data base. The results
favor DEA not only for identifying inefficiencies but also for locating
their sources and amounts. The advantage of DEA is that it
performs a separate optimization for each observation and does not
attempt to capture a great variety of behaviors in a smooth and
simple functional form.

Efficiency analysis, as developed and extended in this report,
contains substantial potential for improving the resource
allocation and evaluation process within the Army. An application
has already been made to recruiting research management (Charnes,
1982). Additional areas that could benefit from efficiency
analysis include training unit performance, weapon system
development, equipment maintenance, logistic management, and
personnel management.

II. PARETO OPTIMALITY, EFFICIENCY ANALYSIS, AND EMPIRICAL PRODUCTION FUNCTIONS


Classically, the economic theory of production is heavily based
on the conceptual use of the Pareto-efficiency (or Pareto-optimal) frontier
of production possibility sets to define "the" production function. The
work of R. Shephard [18], [19], under severe restrictions on the mathematical
structure of production possibility sets and cost relations, developed an
elegant "transform" theory between production aspects and cost aspects [10].
This was applied to various classes of explicitly given parametric functional
forms, and problems of statistical estimation of parameters from data were
considered in classical statistical contexts, especially by successors
such as R. Frisch, S. Afriat, D. Aigner, F. Forsund [1, 2, 16].

These  efforts  were  almost  exclusively  for  single  output  functions. 

M.J. Farrell in [14], seeking to disentangle prices or costs from
"technical" aspects of production, as well as to provide a more meaningful
technical setting to statistical and empirical aspects of production,
defined (for the single output case) a measure of "technical efficiency"
of observed production units relative to the total units observed, assuming
that the production process of inputs to output conversion was linear and of
constant returns to scale.

Building on the unit-by-unit evaluations of Farrell and the
engineering ratio idea of efficiency measure for a single input and output,
efficiency analysis in its managerial aspects and its constructive
extensions to multi-input, multi-output situations was initiated by Charnes,
Cooper and Rhodes in [8], [9]. Subsequent extensions and elaborations
by the former pair with other students and colleagues were made in [7],
[11], [12] . . . with more attention to classical economic aspects and to
the production function side of the mathematical duality structure and
Data Envelopment Analysis first discovered in the CCR work. The CCR ratio
measures and the variants of Farrell, Shephard, Färe, Banker, et al. require,
however, non-Archimedean constructs for rigorous theory and usage. Their
solution methods also do not easily provide important needed properties of
their associated empirical production functions.

Thus, in this paper we introduce as basic the idea of Pareto
optimality with respect to an empirically defined production possibility set.
We characterize the mathematical structures permitted under our minimal
assumptions and contrast these with others' work. Properties such as
isotonicity, non-concavity, economies of scale, piece-wise linearity,
Cobb-Douglas forms, discretionary and non-discretionary inputs are treated through
a new Data Envelopment Analysis method and informatics which permits a
constructive development of an empirical production function and its partial
derivatives without loss of efficiency analysis or use of non-Archimedean
field extensions.

EMPIRICAL FUNCTION SETTING AND GENERATION


By an "empirical" function we shall mean a vector function whose
values are known at a finite number of points and whose values at other
points in its domain are given by linear (usually convex) combinations
of values at known points. The points in the domain are "inputs," the
component values of the vector function "outputs." We shall assume that
inputs are so chosen that convex combinations of input values for each
input are meaningful input values. We assume this for output values as well.

In efficiency analysis, observations are generated by a finite
number of "DMU"s, or "productive," or "response" units, all of which have
the same inputs and outputs. A relative efficiency rating is to be
obtained for each unit. Typically, observations over time will be made
of each unit and the results of efficiency analyses will be employed to
assist in managing each of the units. We assume n units, s outputs and
m inputs. The values are to be non-negative (sometimes positive) numbers.

A HYPOGRAPH EMPIRICAL PRODUCTION POSSIBILITY SET

Given the (empirical) points $(X_j, Y_j)$, $j = 1, \ldots, n$, with $(m \times 1)$ "input"
vectors $X_j \ge 0$ and $(s \times 1)$ "output" vectors $Y_j \ge 0$, we define the "empirical
production set" $P_E$ to be the convex hull of these points, i.e.

(2.1)  $P_E \triangleq \{(x,y) : x = \sum_{j=1}^{n} X_j \mu_j,\ y = \sum_{j=1}^{n} Y_j \mu_j,\ \forall \mu_j \ge 0,\ \sum_j \mu_j = 1\}$

We extend it to our "empirical production possibility set" $Q_E$ by adding to
$P_E$ all points with inputs in $P_E$ and outputs not greater than some output
in $P_E$, i.e.

(2.2)  $Q_E \triangleq \{(x,y) : x = \bar{x},\ y \le \bar{y}$ for some $(\bar{x},\bar{y}) \in P_E\}$

Note that $Q_E$ is contained in (e.g. is smaller than) every production
possibility set heretofore employed, i.e. those studied by Farrell [14],
Shephard [19], Banker, Charnes and Cooper [3], Färe, et al. [13], etc. The
Farrell, Shephard, Färe sets are (truncated) cones; the BCC set (when not
also a cone) adds to $Q_E$ the set

$\{(x,y) : x \ge \bar{x},\ y = \bar{y}$ for some $(\bar{x},\bar{y}) \in Q_E\}$.

These relations may be visualized in the schematic plot of
Figure 1:

[Figure 1: Empirical Production Possibility Set (schematic plot of output y against input x, showing the regions A, B and C)]

where $Q_E = P_E \cup A$, the BCC set is $Q_E \cup B$, and the Farrell, Shephard, Färe
set is $Q_E \cup B \cup C$.

Let $P_E^r$, $Q_E^r$ denote the sets corresponding to $P_E$ and $Q_E$ when only the
output $y_r$ is the ordinate. Evidently a frontier function $f_r(x)$ is determined
by

(2.3)  $f_r(x) = \max\ y_r$ for $(x, y_r) \in Q_E^r$

Then,

Theorem 0: $Q_E^r$ is the hypograph of $f_r(x)$ over $\{x : (x,y) \in Q_E\}$.

Proof: The hypograph of $f_r(x)$ is the set

$H_r \triangleq \{(x, y_r) : y_r \le f_r(x),\ (x,y) \in Q_E\}$,

which is $Q_E^r$ by (2.2) and (2.3).

Let $\mathcal{D}_E$ denote $\{x : (x,y) \in Q_E\}$. It is the domain (the input set)
of our empirical frontier functions.

Theorem 1: $f_r(x)$ is a concave, piecewise linear function on $\mathcal{D}_E$.

Proof: A necessary and sufficient condition for $f_r(x)$ to be concave is
that its hypograph is a convex set (cf. Rockafellar [17], or Fenchel [15]).
The piecewise linearity also follows from the construction of $Q_E^r$ by all convex
combinations of the empirical points $(X_j, Y_j)$, $j = 1, \ldots, n$.

We observe explicitly further that no use whatever has been made of
non-negativity of input and output values in the sets, functions or proofs
of Theorems 0 and 1. Therefore, they hold without this restriction--a
fact we shall employ elsewhere.

Also, no assumptions have been made about the properties of any
underlying function, or function hypograph, from which the $(X_j, Y_j)$ of our
empirical construct may be considered samples. Theorem 1 shows, therefore,
that any empirical (maximum) frontier function is the "concave cap" function
of its graph.
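
The "concave cap" of Theorem 1 can be illustrated in the simplest setting of one input and one output. The following Python sketch is not part of the original report; the function names `concave_cap` and `frontier_value` and the sample observations are hypothetical. It assumes scalar inputs and outputs, finds the vertices of the upper concave envelope of the observed points, and evaluates the piecewise linear frontier by interpolation.

```python
def concave_cap(points):
    """Vertices of the upper concave envelope of (x, y) points, sorted by x."""
    best = {}
    for x, y in points:
        # Keep only the largest output observed at each input level.
        best[x] = max(best.get(x, y), y)
    hull = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or below the chord to p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def frontier_value(hull, x):
    """Evaluate the piecewise linear frontier at x by linear interpolation."""
    if not hull[0][0] <= x <= hull[-1][0]:
        raise ValueError("x lies outside the empirical input domain")
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    return hull[0][1]  # hull with a single vertex


# Hypothetical observations (input, output).
observations = [(1.0, 1.0), (2.0, 3.0), (3.0, 3.5), (4.0, 5.0), (3.0, 2.0)]
cap = concave_cap(observations)
```

In this example the observation (3.0, 3.5) lies below the chord joining (2.0, 3.0) and (4.0, 5.0), so it is not a vertex of the cap even though it is the best output observed at that input level.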

THE EMPIRICAL PARETO-OPTIMAL PRODUCTION FUNCTION

A Pareto-optimum for a finite set of functions $g_1(x), \ldots, g_K(x)$ is
a point $x^*$ such that there is no other point $x$ in the domain of these
functions such that

(3.1)  $g_k(x) \le g_k(x^*)$, $k = 1, \ldots, K$

with at least one strict inequality. Charnes and Cooper in [5], Chapter IX,
showed that $x^*$ is Pareto-optimal iff $x^*$ is an optimal solution to the
mathematical (goal) program

(3.2)  $\min \sum_{k=1}^{K} g_k(x)$ subject to $g_k(x) \le g_k(x^*)$, $k = 1, \ldots, K$

This was employed by Ben-Israel, Ben-Tal and Charnes in [4] to develop
the currently strongest necessary and sufficient conditions for a
Pareto-optimum in convex programming.
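
Definition (3.1) can be checked directly when the domain is restricted to a finite list of candidate points. The Python sketch below is an illustration only, under that restrictive assumption: it compares each candidate against the other listed candidates and does not examine convex combinations, so it is weaker than the test (3.2) over a full production possibility set. The names `dominates` and `pareto_optimal_indices` and the sample values are hypothetical.

```python
def dominates(g_a, g_b):
    """True if g_a <= g_b in every component, with at least one strict inequality."""
    no_worse = all(a <= b for a, b in zip(g_a, g_b))
    strictly_better = any(a < b for a, b in zip(g_a, g_b))
    return no_worse and strictly_better


def pareto_optimal_indices(g_values):
    """Indices of candidates that no other listed candidate dominates."""
    return [
        i
        for i, g in enumerate(g_values)
        if not any(dominates(h, g) for j, h in enumerate(g_values) if j != i)
    ]


# Hypothetical values (g_1, g_2) at four candidate points; smaller is better.
g_values = [(1, 4), (2, 2), (3, 3), (4, 1)]
optimal = pareto_optimal_indices(g_values)
```

Here the third candidate is dominated by the second, while the other three are mutually non-dominated.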

Utilizing (3.2) we can now define and construct, im(or ex-)plicitly,
the Pareto-optimal (or "Pareto-efficient") empirical (frontier) production
function. Other usages of (3.2) to generalizations such as the "functional
efficiency" of Charnes and Cooper [5] will not be developed here.

First, by (3.2), the Pareto-optimal points (inputs!) among our n
empirical points can be determined. The empirical Pareto-optimal function
is then defined on the convex hull of these points by convex combinations
of the "output" values. Note that the convex hull of the Pareto-optimal
points might not include all of the input domain $\mathcal{D}_E$ since only the doubled line portion of
the frontier is Pareto-optimal.

Since for efficient production we wish to maximize on outputs while
minimizing on inputs, our relevant $g_k(x)$ include both outputs and inputs, e.g.

(3.3)  $g_k(x) \triangleq \begin{cases} -y_k, & 1 \le k \le s \\ x_i, & k = s+i,\ i = 1, \ldots, m \end{cases}$  for $(x,y) \in Q_E$

For the optimization in (3.2) we clearly need only consider $(x,y) \in P_E$
rather than $Q_E$. Thus the constraint inequalities in (3.2) are, for a test
point $(x^*, y^*)$:

(3.4)  $y \ge y^*$, $x \le x^*$

and we have

Theorem 2: The envelopment constraints of Data Envelopment Analysis in
production analysis are the Charnes-Cooper constraints for testing
Pareto-optimality of an empirical production point.

In no way, as others, e.g. Färe [13], have mistakenly asserted, is
Data Envelopment Analysis restricted to linear constant returns to scale
functions or to truncated cone domains. Evidently via (3.2), Data
Envelopment Analysis applies to much more general functions, function domains
and other situations than the current empirical production function one.

To test an empirical "input-output" point $(X_o, Y_o)$ for Pareto-optimality,
the $C^2$ (Charnes and Cooper) test of (3.2) becomes

(3.5)
    $\min\ -e^T Y\lambda + e^T X\lambda$
    subject to
    $Y\lambda - s^+ = Y_o$
    $-X\lambda - s^- = -X_o$
    $e^T \lambda = 1$
    $\lambda,\ s^+,\ s^- \ge 0$

where $X \triangleq [X_1, \ldots, X_n]$, $Y \triangleq [Y_1, \ldots, Y_n]$.

Since $-e^T(Y\lambda - Y_o) + e^T(X\lambda - X_o)$ is an equivalent functional (it differs from
the above one only by a constant), we can rewrite the problem for convenience
in later comparisons as:

(3.6)
    $\min\ -e^T s^+ - e^T s^-$
    subject to
    $Y\lambda - s^+ = Y_o$
    $-X\lambda - s^- = -X_o$
    $e^T \lambda = 1$
    with $\lambda,\ s^+,\ s^- \ge 0$

This is the new DEA form for the production possibility set via $Q_E$. As we
shall see later, other variations of $Q_E$ can be accommodated easily by
simple modifications of or additions to the constraints on $\lambda$. Its informatics
and software involve only minor modification from that of the Charnes,
Cooper, Seiford and Stutz paper [11] as developed by I. Ali and J. Stutz
for the Center for Cybernetic Studies of The University of Texas at Austin.
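
To make the constraint structure of (3.6) concrete, the following Python sketch evaluates the slacks and the functional for a given candidate $\lambda$. It is an illustration under stated assumptions, not the software referred to above: it does not solve the linear program, it only checks feasibility of a supplied $\lambda$ and reports $s^+$, $s^-$ and $-e^T s^+ - e^T s^-$. The function name `additive_slacks`, the list-of-rows data layout and the sample data are hypothetical.

```python
def additive_slacks(X, Y, lam, o, tol=1e-9):
    """Slacks and objective of (3.6) for a candidate lambda.

    X is a list of m input rows and Y a list of s output rows, each row
    holding one value per DMU; o is the index of the DMU being tested.
    """
    n = len(lam)
    if any(l < -tol for l in lam) or abs(sum(lam) - 1.0) > tol:
        raise ValueError("lambda must be non-negative and sum to one")
    # s+ = Y*lambda - Y_o (output surplus), s- = X_o - X*lambda (input saving).
    s_plus = [sum(row[j] * lam[j] for j in range(n)) - row[o] for row in Y]
    s_minus = [row[o] - sum(row[j] * lam[j] for j in range(n)) for row in X]
    if any(v < -tol for v in s_plus + s_minus):
        raise ValueError("lambda violates the envelopment constraints")
    return s_plus, s_minus, -sum(s_plus) - sum(s_minus)


# Hypothetical data: one input, one output, three DMUs.
X = [[2, 4, 4]]
Y = [[2, 4, 3]]
# Test DMU 2 (input 4, output 3) against DMU 1 (input 4, output 4).
s_plus, s_minus, objective = additive_slacks(X, Y, [0, 1, 0], o=2)
```

A strictly negative objective for some feasible $\lambda$ shows that the tested point is not Pareto-optimal; certifying Pareto-optimality requires solving (3.6) over all feasible $\lambda$ with a linear programming code.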


EFFICIENCY  ANALYSIS 


As mentioned, managerial and program comparison aspects of
efficiency analysis were initiated by Charnes, Cooper and Rhodes in [6],
[8], and [9], through a generalization of the single input, single output
absolute efficiency determination of classical engineering and science to
multi-input, multi-output relative efficiencies of a finite number of
decision-making units, "DMU's" (sometimes called "productive" units or "response"
units). The multi-input, multi-output situations were reduced to the "virtual" single
input, single output ones through use of virtual multipliers and sums.

Explicitly, the CCR ratio measure of efficiency of the DMU designated "o"
is given by the non-linear, non-convex, non-Archimedean fractional program
(see [7]):

(4.1)
    $\max\ \dfrac{\eta^T Y_o}{\xi^T X_o}$
    subject to
    $\dfrac{\eta^T Y_j}{\xi^T X_j} \le 1$, $j = 1, \ldots, n$
    $-\dfrac{\eta^T}{\xi^T X_o} \le -\varepsilon e^T$
    $-\dfrac{\xi^T}{\xi^T X_o} \le -\varepsilon e^T$

where the entries of the $X_j$ and $Y_j$ are assumed positive, $\varepsilon$ is a
non-Archimedean infinitesimal, $e^T$ is a row vector of ones and, by abuse of
notation, has $s$ entries for $\eta^T$, $m$ entries for $\xi^T$. $(X_o, Y_o)$ is one of the
$n$ input-output pairs.

Employing the Charnes-Cooper transformation of fractional programming

(4.2)  $\mu^T \triangleq \eta^T / \xi^T X_o$, $\nu^T \triangleq \xi^T / \xi^T X_o$, $\nu^T X_o = 1$

we obtain the dual non-Archimedean linear programs

(4.3)
    $\max\ \mu^T Y_o$                    $\min\ \theta - \varepsilon e^T s^+ - \varepsilon e^T s^-$
    subject to                           subject to
    $\nu^T X_o = 1$                      $Y\lambda - s^+ = Y_o$
    $\mu^T Y - \nu^T X \le 0$            $\theta X_o - X\lambda - s^- = 0$
    $-\mu^T \le -\varepsilon e^T$        $\lambda,\ s^+,\ s^- \ge 0$
    $-\nu^T \le -\varepsilon e^T$

where $X \triangleq [X_1, \ldots, X_n]$, $Y \triangleq [Y_1, \ldots, Y_n]$.

Although, clearly, no assumptions have been made concerning the
type of functional relations for the input-output pairs $(X_j, Y_j)$, the dual
program may be recognized as having the Data Envelopment Analysis constraints
for an empirical production possibility set of Farrell, Shephard, etc. cone
type $Q_E \cup B \cup C$, and, since

(4.4)  $\theta - \varepsilon[e^T Y\lambda - e^T X\lambda]$

is an equivalent form for the functional, as being a Charnes-Cooper
Pareto-optimality test for $(\theta X_o, Y_o)$ over the cone on the $(X_j, Y_j)$, $j = 1, \ldots, n$, with
pre-emption on the intensity $\theta$ of input $X_o$. As shown, for example, in [7],
DMU$_o$ is efficient iff $\theta^* = 1$, $s^{*+} = 0$, $s^{*-} = 0$.
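
In the special case of one input and one output, and ignoring the infinitesimal $\varepsilon$ terms, the left-hand program of (4.3) can be solved by inspection: the constraints give $\nu = 1/x_o$ and $\mu \le \nu \min_j (x_j / y_j)$, so the optimal value is the ratio $y_o / x_o$ divided by the best observed ratio. The Python sketch below illustrates only this special case, which is the normalized engineering ratio mentioned earlier; the function name `ccr_ratio_efficiency` and the data are hypothetical, and positive data are assumed.

```python
def ccr_ratio_efficiency(x, y, o):
    """Single-input, single-output CCR value: own ratio over the best ratio."""
    best_ratio = max(y_j / x_j for x_j, y_j in zip(x, y))
    return (y[o] / x[o]) / best_ratio


# Hypothetical data for three DMUs.
inputs = [2.0, 4.0, 5.0]
outputs = [2.0, 2.0, 5.0]
ratings = [ccr_ratio_efficiency(inputs, outputs, o) for o in range(3)]
```

With several inputs or outputs no such closed form is available and one of the linear programs in (4.3) must be solved for each DMU.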

Re informatics, which are particularly important since all n
efficiency evaluations must be made (i.e., n linear programs must be solved),
the dual problem can be computed exactly (in the base field) as shown in [5],
e.g., with the code NONARC of Dr. I. Ali (Center for Cybernetic Studies, The
University of Texas at Austin), or approximately by using a sufficiently
small numerical value for $\varepsilon$. A typical efficient point is designated by
(x,y) in Figure 1.

If a DMU is inefficient, the optimal $\lambda_j^* > 0$ in its DEA problem
(= Charnes-Cooper test) designate efficient DMU's. Thus, a "proper" subset of
the efficient DMU's determines the efficiency value of an inefficient
DMU. The convex combinations of this subset are also efficient. Thereby
to each inefficient DMU a "facet" of efficient DMU's is associated. The
transformation

(4.5)  $X_o \to \theta^* X_o - s^{*-}$, $Y_o \to Y_o + s^{*+}$

where the asterisk designates optimality, projects DMU$_o$, i.e., $(X_o, Y_o)$, onto
its efficiency facet.
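
The projection (4.5) is simple arithmetic once an optimal solution is in hand. The Python sketch below assumes that $\theta^*$, $s^{*-}$ and $s^{*+}$ have already been obtained from a linear programming code; the function name `project_ccr` and the numerical values are hypothetical.

```python
def project_ccr(x_o, y_o, theta, s_minus, s_plus):
    """Apply (4.5): X_o -> theta* X_o - s*-,  Y_o -> Y_o + s*+."""
    x_projected = [theta * x_i - s_i for x_i, s_i in zip(x_o, s_minus)]
    y_projected = [y_r + s_r for y_r, s_r in zip(y_o, s_plus)]
    return x_projected, y_projected


# Hypothetical optimal values for a DMU with two inputs and one output.
x_projected, y_projected = project_ccr(
    x_o=[4.0, 6.0], y_o=[2.0], theta=0.5, s_minus=[0.0, 1.0], s_plus=[0.0]
)
```

The differences between the projected and the observed values are the input reductions and output increases that would bring the DMU onto its facet.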

This projection was employed by Charnes, Cooper and Rhodes [9] to correct
for differences in managerial ability in their analysis of programs
Follow-Through and non-Follow-Through. It also shows quantitatively what improvements
in inputs and outputs will (ceteris paribus) bring a DMU to efficient operation.
Thus, although the relative efficiency measure of an inefficient DMU will
involve the infinitesimal $\varepsilon$, non-infinitesimal changes for improvement are
suggested.

Both Farrell and Shephard knew that ratio measures required adjustments
to correctly exhibit inefficiency of the second DMU in examples like the
following 2 input, 1 output, 2 DMU case:

[Example: schematic of the 2-input, 1-output, 2-DMU case]

Farrell added geometric points at infinity; Shephard simply excluded such
cases without giving a method for their exclusion. The non-Archimedean
extension in the CCR formulation is necessary to have an algebraically
closed system of linear programming type. Linear programming theory holds
for non-Archimedean as well as Archimedean entries in the vector and matrix
problem data.


Our new Pareto-optimal DEA method, like $C^2S^2$ [11], associates facets with
non-optimal (= non-Pareto-efficient) DMU's. Clearly, by the $C^2$-test, DMU$_o$
is Pareto-efficient (Pareto-optimal) iff $-e^T s^{*+} - e^T s^{*-} = 0$, i.e., iff the
$\ell_1$-distance from $(X_o, Y_o)$ to the farthest "northwesterly" $(X\lambda, Y\lambda)$ point is zero.
The CCR efficient DMU's are also among the new Pareto-optimal DMU's. Projection
of a non-optimal DMU onto its Pareto-efficient facet is rendered by

(4.6)  $X_o \to X_o - s^{*-}$, $Y_o \to Y_o + s^{*+}$

To achieve a convenient efficiency measure, we modify the functional by
multiplying it by a $\delta > 0$ and consider

(4.7)  $-\delta e^T s^{*+} - \delta e^T s^{*-}$

where the asterisk denotes optimality, as the logarithm of the efficiency
measure. When the data in X and Y are scaled to lie between 0 and 100, a
$\delta = 1/[10(m+s)]$ will yield a logarithm between 0 and -10. This measure might
then be called the "efficiency pH" by analogy with the pH of chemistry.
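
The scaling in (4.7) can be checked numerically. The Python sketch below assumes that the optimal slacks are already available and that the data have been scaled to lie between 0 and 100, so that each slack is at most 100; the function name `efficiency_ph` and the slack values are hypothetical.

```python
def efficiency_ph(s_plus, s_minus, delta=None):
    """Value of (4.7): -delta * (e^T s*+ + e^T s*-).

    When delta is not supplied, delta = 1 / (10 * (m + s)) is used, with
    m the number of inputs and s the number of outputs.
    """
    m, s = len(s_minus), len(s_plus)
    if delta is None:
        delta = 1.0 / (10 * (m + s))
    return -delta * (sum(s_plus) + sum(s_minus))


# Hypothetical optimal slacks for a DMU with two inputs and one output.
ph_value = efficiency_ph(s_plus=[20.0], s_minus=[10.0, 0.0])
# Largest possible slacks when every data value lies between 0 and 100.
ph_worst = efficiency_ph(s_plus=[100.0], s_minus=[100.0, 100.0])
```

A Pareto-optimal DMU has zero slacks and therefore a value of 0; the worst case with all slacks equal to 100 gives -10, which is the range stated above.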

Our new measure relates to the units invariant multiplicative measure
of Charnes, Cooper, Seiford and Stutz [12], which as shown there is necessary
and sufficient that the DEA envelopments be piecewise Cobb-Douglas, by
considering the entries in the $X_j$, $Y_j$ to be logarithms of the entries of the
input and output vectors which we employ in the multiplicative formulation.


INFORMATICS  AND  FUNCTION  PROPERTIES 


(A) Partial Derivatives:

The guidance provided by the CCR, BCC, $C^2S^2$ formulations does not
include convenient access to the rates of change of the outputs with change
in the inputs. The optimal dual variables in the DEA side linear programming
problems give rates of change of the efficiency measure with changes in inputs
or outputs. The non-Archimedean formulations further may give infinitesimal
rates, which are not easily employed. And, for most of the efficient points
one has non-differentiability because they are extreme points rather than
(relative) interior points. Nevertheless, because of the informatics, e.g.,
computational tactics, we employ in testing via $C^2$ for Pareto-optimality,
the following constructive method can be employed.

On reaching a non-Pareto optimal point, our software discovers all
the optimal points in its facet, hence, implicitly, all the convex
combinations which form the facet. Since the Pareto-optimal facet is a linear
surface it is not only differentiable everywhere in its relative interior
but all its partial derivatives are constant throughout the facet. Thus,
we need only obtain these for any relative interior point of the facet to
have them for the whole facet. Such a point is the average of the
Pareto-optimal points of the facet.

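Let

(5.1)  $F(x_1, \ldots, x_m, y_1, \ldots, y_s) = 0$

be the linear equation of the facet. Since we have sufficient differentiability
in the neighborhood of the average point $(\bar{x}, \bar{y})$, we know

(5.2)  $\left.\dfrac{\partial y_r}{\partial x_i}\right|_{(\bar{x},\bar{y})} = -\dfrac{\partial F / \partial x_i}{\partial F / \partial y_r}$

where the right side partial derivatives are also evaluated at $(\bar{x}, \bar{y})$.

Suppose we run the $C^2$-test with $(\bar{x}, \bar{y})$ as the point being tested. Then
the optimal dual variables corresponding to input $x_i$ and output $y_r$ are respectively

(5.3)  $\partial F / \partial x_i$ and $\partial F / \partial y_r$.

Thus, the rate of change of output $y_r$ with
respect to input $x_i$ is simply the negative of the ratio of the optimal dual
$x_i$ constraint variable to the optimal dual $y_r$ constraint variable!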

More specifically, all Pareto-optimal $(X_j, Y_j)$ of the facet for the
barycenter $(\bar{x}, \bar{y})$ satisfy

(5.4)  $\mu^{*T} Y_j - \nu^{*T} X_j - \varphi^* = 0$

where $(\mu^{*T}, \nu^{*T}, \varphi^*)$ are the dual evaluators at an optimal basic solution,
since they do not depend on the $C^2$-test right hand sides. Thereby our

(5.5)  $F(x,y) = \mu^{*T} y - \nu^{*T} x - \varphi^* = 0$

Clearly, $\mu_r^* = \partial F / \partial y_r$, $-\nu_i^* = \partial F / \partial x_i$ as already stated.
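
For a facet written in the form (5.5), the rate of change of output $y_r$ with respect to input $x_i$, all other inputs and outputs held fixed, is $-(\partial F/\partial x_i)/(\partial F/\partial y_r) = \nu_i^* / \mu_r^*$. The Python sketch below illustrates this arithmetic together with the facet average; it assumes the dual evaluators have already been produced by a linear programming code, and the function names `barycenter` and `facet_rates` and the numbers are hypothetical.

```python
def barycenter(points):
    """Component-wise average of the Pareto-optimal points of a facet."""
    count = len(points)
    return [sum(p[k] for p in points) / count for k in range(len(points[0]))]


def facet_rates(mu, nu):
    """rates[r][i] = dy_r/dx_i on the facet mu^T y - nu^T x - phi = 0."""
    if any(mu_r == 0 for mu_r in mu):
        raise ValueError("an output with a zero dual evaluator has no defined rate")
    return [[nu_i / mu_r for nu_i in nu] for mu_r in mu]


# Hypothetical facet 2*y - 1*x1 - 4*x2 = 0, i.e. y = 0.5*x1 + 2*x2.
rates = facet_rates(mu=[2.0], nu=[1.0, 4.0])
center = barycenter([[0.0, 0.0], [2.0, 4.0]])
```

Because the facet is linear, the rates are the same at every point of the facet, which is why a single relative interior point such as the barycenter suffices.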

(B) Isotonicity and Economies of Scale:

Theorem 1 shows that every component of the empirical frontier
production function is a concave function.

Suppose $x^1$ and $x^2$ are the inputs of two Pareto-optimal DMU's in
the same facet and $x^1 \ge x^2$. Since $x^1 = X\lambda^*(x^1)$ and $x^2 = X\lambda^*(x^2)$ we must have
$e^T X\lambda^*(x^1) \ge e^T X\lambda^*(x^2)$. But for Pareto-optimality, $e^T Y\lambda^*(x^i) = e^T X\lambda^*(x^i)$, $i = 1, 2$,
so that $e^T Y\lambda^*(x^1) \ge e^T Y\lambda^*(x^2)$. Then, letting $f^P(x)$ denote the empirical
Pareto-optimal (vector) function we have

(5.6)  $e^T f^P(x^1) \ge e^T f^P(x^2)$

Further, if $x^\mu = \mu x^1 + (1-\mu) x^2$, $0 \le \mu \le 1$, then $f^P(x^\mu) = \mu f^P(x^1) +
(1-\mu) f^P(x^2)$ by construction of the empirical frontier function and we have

$e^T f^P(x^1) \ge e^T f^P(x^\mu) \ge e^T f^P(x^2)$.

For the single output case of Farrell, etc., then

Theorem 3: If there is only a single output, the empirical Pareto-optimal
production function is isotonic in every facet (regardless of what underlying
production function we have sampled from).

Proof: A function $f(x)$ is "isotonic" iff $x^a \ge x^b$ implies $f(x^a) \ge f(x^b)$.
Also $e^T f^P(x) = f^P(x)$ with a single output.

Possibly because of ignorance of standard mathematical terminology,
the isotonic property has been called "strong disposability" in the economics
literature. The name "weak disposability" has also been used for the
weaker property $f(\mu x) \ge f(x)$ whenever $\mu \ge 1$. A better name might be "ray
isotonic."

Our arguments preceding Theorem 3 establish a "sum isotonic" property
on facets for the empirical Pareto-optimal function with multiple output
components (regardless of the underlying production function set we have
sampled from), namely,

Theorem 4: $e^T f^P(x^a) \ge e^T f^P(x^b)$ whenever $e^T x^a \ge e^T x^b$ with $x^a$, $x^b$ in the same
facet.

Classically in economics, production functions studied have usually
been assumed to be homogeneous and defined on the non-negative orthant.
Thereby, whether or not a function for which $f(\mu x) = \mu^\alpha f(x)$, with $\mu \ge 0$,
had economies of scale would be decided by the value of the exponent $\alpha$.
More generally, increasing or decreasing "return to scale" would be present
respectively, at $\bar{x}$ if $f(\mu x) > \mu f(x)$ or $f(\mu x) < \mu f(x)$ for $\mu > 1$ at points ($x$
in a small neighborhood of $\bar{x}$). The BCC paper [3] gives a criterion for
deciding this (with production possibility set $Q_E \cup B \cup C$ or $Q_E \cup B$) but does
not give us the rates of change.
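
The comparison of $f(\mu x)$ with $\mu f(x)$ can be tried numerically on a homogeneous function. The Python sketch below is a rough illustration under stated assumptions: it tests a single point and a single scale factor $\mu > 1$ rather than a neighborhood, and it uses a Cobb-Douglas form whose degree of homogeneity is the sum of its exponents. The names `returns_to_scale` and `cobb_douglas` are hypothetical.

```python
def returns_to_scale(f, x, mu=2.0, tol=1e-9):
    """Classify returns to scale at x by comparing f(mu*x) with mu*f(x)."""
    scaled = f([mu * x_i for x_i in x])
    proportional = mu * f(x)
    if scaled > proportional + tol:
        return "increasing"
    if scaled < proportional - tol:
        return "decreasing"
    return "constant"


def cobb_douglas(exponents):
    """Return f(x) = product of x_i ** a_i, homogeneous of degree sum(a_i)."""
    def f(x):
        value = 1.0
        for x_i, a_i in zip(x, exponents):
            value *= x_i ** a_i
        return value
    return f


point = [4.0, 16.0]
low = returns_to_scale(cobb_douglas([0.5, 0.25]), point)   # degree 0.75
unit = returns_to_scale(cobb_douglas([0.5, 0.5]), point)   # degree 1
high = returns_to_scale(cobb_douglas([1.0, 1.0]), point)   # degree 2
```

The three cases correspond to an exponent $\alpha$ below, equal to, and above one.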

Because of our preceding theorems, however, we know that empirical
Pareto-optimal functions are sum-isotonic on facets and concave in each
component function regardless of the nature of the underlying production
possibility set. Thereby, we automatically anticipate lower and lower
returns to scale in going from facet to facet with increasing $e^T x$. And
our partial derivatives can give us explicitly the rates of change in each
observed facet.

Practically, our choices of inputs are generally made with the
expectation that the underlying Pareto-optimal function is isotonic, i.e.,
we choose the form of the inputs so that an increase in an input should
not decrease the outputs. But even here we need still more to determine
the non-concave portions of an isotonic function. For example, in Figure 2
an isotonic function is plotted together with the resulting concave cap.

[Figure 2: Isotonic Function with Concave Cap]

(C) Discretionary and Non-Discretionary Inputs:

In a number of practical applications, certain relevant inputs, e.g.,
unemployment rate, population, median income, are not subject to
"discretionary" change by the decision-makers of decision-making units. These are
called "non-discretionary" inputs. They are important in influencing the
outputs and in furnishing the reference background in terms of which units'
efficiency is rated. Not infrequently the facet associated with an
inefficient unit has the same values for the non-discretionary inputs, in
which case there is no problem with the rating assigned. If not, however,
to obtain more meaningful ratings we can add constraints on $\lambda$ to those in
(3.5) which require the non-discretionary inputs to be the same as that of the
unit being evaluated. Thereby, a more meaningful rating will be attained.
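
One way to read the added constraints is as one equality row on $\lambda$ per non-discretionary input, requiring $\sum_j X_{ij} \lambda_j = X_{io}$. The Python sketch below builds such rows and checks a candidate $\lambda$ against them; it is an interpretation offered for illustration, not the report's software, and the function names `nondiscretionary_rows` and `satisfies_rows` and the data are hypothetical.

```python
def nondiscretionary_rows(X, o, nondiscretionary):
    """Equality rows sum_j X[i][j] * lam[j] = X[i][o] for each fixed input i."""
    rows = [list(X[i]) for i in nondiscretionary]
    rhs = [X[i][o] for i in nondiscretionary]
    return rows, rhs


def satisfies_rows(rows, rhs, lam, tol=1e-9):
    """Check whether a candidate lambda meets every equality row."""
    return all(
        abs(sum(a * l for a, l in zip(row, lam)) - b) <= tol
        for row, b in zip(rows, rhs)
    )


# Hypothetical data: input 0 is discretionary, input 1 is not; three DMUs.
X = [[2.0, 4.0, 6.0],
     [5.0, 5.0, 7.0]]
rows, rhs = nondiscretionary_rows(X, o=0, nondiscretionary=[1])
same_background = satisfies_rows(rows, rhs, [0.5, 0.5, 0.0])
different_background = satisfies_rows(rows, rhs, [0.0, 0.5, 0.5])
```

The rows and right-hand sides can be appended to the equality constraints of (3.5) before the problem is passed to a linear programming code.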


CONCLUSIONS 


We  have  shown  how  direct  application  of  the  Charnes-Cooper  test 
for  Pareto  optimality  leads  to  a  simpler  and  more  robust  method,  efficiency 
pH,  encompassing  all  previous  ones  for  ascertaining  "efficiency."  Further, 
Pareto-optimal  characterizations  and  constructions  of  empirical  production 
functions  restrict  us  methodologically  to  exploration  of  such  functions 
by  means  of  concave  sum-isotonic  caps.  Economies  of  scale  from  these 
thereby  expectedly  decrease  with  increase  in  the  magnitude  of  the  input 
vectors.  Use  of  transformations  of  outputs,  as  we  suggest,  can  uncover 
non-concave  regions  of  the  underlying  production  function  where  substantial 
economies  of  scale  may  prevail.  Our  new  informatics  device  and  theory 
of  the  use  of  the  facet  average  (or  barycenter)  also  constructively 
furnishes  quantitative  estimates  of  the  rates  of  change  of  outputs  with 
respect  to  inputs  which  have  not  been  available  previously.  These  new 
devices,  as  with  other  usages  of  empirical  functions,  suggest  important 
new  areas  for  development  of  statistical  theory  to  distinguish  between 
true  properties  and  sampling  "accidents."  The  vital  importance  of  further 
development  of  the  informatics  of  solution  of  systems  of  adaptively 
developed  linear  programming  problems  for  Pareto-optimal  constructions 
should  also  be  clear. 

Informatically, we are doing this by applying transformations of

1 

-  -  P 

form  g^  (y^ )  =  +  (y^-y^)  with  3  =  20  to  obtain  possible  new  facets  in 

the  gn(yn). 

Problems  do  arise,  of  course,  on  whether  one  gets  spurious 
empirical  frontier  portions  in  this  manner  for  empirical  points  which 
should  "really"  be  inefficient.  Evidently  such  non-concave  portions  are 
portions  of  increasing  returns  to  scale  if  they  are  truly  on  the  frontier. 

III.  INVARIANT MULTIPLICATIVE EFFICIENCY AND PIECEWISE COBB-DOUGLAS ENVELOPMENTS

Introduction 

In [1], Charnes, Cooper, Seiford, and Stutz (C2S2) develop a multiplicative
(or log) measure of the relative efficiency of multiple input, multiple output
productive (or "decisionmaking") units (DMU's). In contrast to the CCR measure
[2, 3], the multiplicative measure obtained in [1] is not invariant under change
of units in the inputs or outputs. We show here how, by a simple change that
preserves the multiplicative format, a units invariant multiplicative measure
can be obtained. Interestingly, the Data Envelopment Analysis (DEA) associated
with this new modification necessarily yields optimal envelopments by
Cobb-Douglas functions, i.e., the efficiency surface is piecewise Cobb-Douglas
rather than merely log-linear! This uncovers a new role for Cobb-Douglas
functions:¹ they are necessary for the units invariant property of a
multiplicative measure.

Units  Invariant  Multiplicative  Efficiencies 

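The C2S2 multiplicative model reduces the input-output quantities to single
virtual output to input ratios. If we now introduce an additional virtual output
multiplier and virtual input multiplier, we obtain the following form for our
problem to measure the efficiency of DMU_0 relative to all the n DMU's:

\[
\begin{aligned}
\max\ & \Bigl(e^{\eta}\prod_{r=1}^{s} Y_{r0}^{\mu_r}\Bigr) \Big/ \Bigl(e^{\xi}\prod_{i=1}^{m} X_{i0}^{\nu_i}\Bigr) \\
\text{s.t.}\ & \Bigl(e^{\eta}\prod_{r=1}^{s} Y_{rj}^{\mu_r}\Bigr) \Big/ \Bigl(e^{\xi}\prod_{i=1}^{m} X_{ij}^{\nu_i}\Bigr) \le 1, \qquad j = 1, \ldots, n, \\
& -\eta \le 0, \quad -\xi \le 0, \quad -\mu_r \le -\delta, \quad -\nu_i \le -\delta, \qquad \forall\, r, i,
\end{aligned}
\tag{1}
\]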

¹Other properties relating Cobb-Douglas forms to more general classes of
functions are examined in [5].
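
where δ > 0 and DMU_0 is one of the n DMU's in the constraints.

Suppose the units in the outputs and the inputs are changed so that $Y_{rj}$
becomes $a_r Y_{rj}$ and $X_{ij}$ becomes $b_i X_{ij}$, where $a_r, b_i > 0$,
$\forall\, r, i$ (note that $a_r = 1$ or $b_i = 1$ corresponds to no change in
those units). Problem (1) becomes

\[
\begin{aligned}
\max\ & \Bigl(e^{\hat\eta}\prod_{r} a_r^{\mu_r}\prod_{r} Y_{r0}^{\mu_r}\Bigr) \Big/ \Bigl(e^{\hat\xi}\prod_{i} b_i^{\nu_i}\prod_{i} X_{i0}^{\nu_i}\Bigr) \\
\text{s.t.}\ & \Bigl(e^{\hat\eta}\prod_{r} a_r^{\mu_r}\prod_{r} Y_{rj}^{\mu_r}\Bigr) \Big/ \Bigl(e^{\hat\xi}\prod_{i} b_i^{\nu_i}\prod_{i} X_{ij}^{\nu_i}\Bigr) \le 1, \qquad j = 1, \ldots, n, \\
& -\hat\eta \le 0, \quad -\hat\xi \le 0, \quad -\mu_r \le -\delta, \quad -\nu_i \le -\delta, \qquad \forall\, r, i.
\end{aligned}
\tag{2}
\]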

If (1) has optimal value E(1) with $\eta^*, \xi^*, \mu^*, \nu^*$ an optimal
solution, then

\[
\exp(\hat\eta) = K \exp(\eta^*) \Big/ \prod_{r} a_r^{\mu_r^*}, \qquad
\exp(\hat\xi) = K \exp(\xi^*) \Big/ \prod_{i} b_i^{\nu_i^*}, \qquad \mu^*,\ \nu^*
\]

(where $K > 0$ assures $\hat\eta, \hat\xi \ge 0$) is "feasible" for (2) with
value E(1). Hence for (2) the optimal value E(2) ≥ E(1). Similarly, from an
optimal solution $\hat\eta, \hat\xi, \hat\mu, \hat\nu$ to (2) we construct a
feasible solution to (1) with value E(2). Thereby E(1) ≤ E(2) ≤ E(1), i.e., the
efficiency value is invariant under change of units.
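
The construction in this argument can be checked numerically. The sketch below uses hypothetical data and multiplier values (not taken from the report) and verifies that, after the units of every output and input are rescaled, the adjusted intercepts leave the log of every DMU's ratio unchanged. Because every ratio is preserved, feasibility and the objective value carry over as well. The function names and the rule for choosing log K are illustrative assumptions.

```python
import math

def log_ratio(eta, xi, mu, nu, y, x):
    """Log of the multiplicative efficiency ratio of problem (1) for one DMU."""
    num = eta + sum(m * math.log(v) for m, v in zip(mu, y))
    den = xi + sum(n * math.log(v) for n, v in zip(nu, x))
    return num - den

def rescale_solution(eta, xi, mu, nu, a, b):
    """Map multipliers for (1) to intercepts for (2) that give the same ratios."""
    shift_out = sum(m * math.log(ar) for m, ar in zip(mu, a))
    shift_in = sum(n * math.log(bi) for n, bi in zip(nu, b))
    # log K is taken just large enough that both new intercepts are >= 0.
    log_k = max(0.0, shift_out - eta, shift_in - xi)
    return log_k + eta - shift_out, log_k + xi - shift_in

# Hypothetical data: 3 DMUs, 2 outputs, 2 inputs.
Y = [[10.0, 4.0], [8.0, 6.0], [12.0, 3.0]]
X = [[5.0, 2.0], [4.0, 3.0], [7.0, 2.5]]
eta, xi, mu, nu = 0.2, 0.5, [0.3, 0.4], [0.6, 0.5]
a, b = [1000.0, 0.01], [12.0, 0.5]  # unit-change factors a_r, b_i

eta_hat, xi_hat = rescale_solution(eta, xi, mu, nu, a, b)
before = [log_ratio(eta, xi, mu, nu, y, x) for y, x in zip(Y, X)]
after = [
    log_ratio(eta_hat, xi_hat, mu, nu,
              [ar * v for ar, v in zip(a, y)],
              [bi * v for bi, v in zip(b, x)])
    for y, x in zip(Y, X)
]
print(all(abs(p - q) < 1e-9 for p, q in zip(before, after)))
```

The common factor K cancels in the ratio, which is why it can be chosen freely to keep both intercepts nonnegative.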

The  Cobb-Douglas  Property 

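Taking logarithms in (1) and going to vector matrix notation as in [1], with
$\hat{Y}$, $\hat{X}$ denoting the matrices of logarithms of the observed outputs
and inputs, we obtain the dual linear programming problems:

(I)
\[
\begin{aligned}
\max\ & \eta - \xi + \mu^{T}\hat{Y}_0 - \nu^{T}\hat{X}_0 \\
\text{s.t.}\ & \eta e^{T} - \xi e^{T} + \mu^{T}\hat{Y} - \nu^{T}\hat{X} \le 0 \\
& -\eta \le 0 \\
& -\xi \le 0 \\
& -\mu^{T} \le -\delta e^{T} \\
& -\nu^{T} \le -\delta e^{T}
\end{aligned}
\]

(II)
\[
\begin{aligned}
\min\ & -\delta e^{T}s^{+} - \delta e^{T}s^{-} \\
\text{s.t.}\ & e^{T}\lambda - \theta^{+} = 1 \\
& -e^{T}\lambda - \theta^{-} = -1 \\
& \hat{Y}\lambda - s^{+} = \hat{Y}_0 \\
& -\hat{X}\lambda - s^{-} = -\hat{X}_0 \\
& \lambda,\ \theta^{+},\ \theta^{-},\ s^{+},\ s^{-} \ge 0
\end{aligned}
\tag{3}
\]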
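
Here II represents the DEA side of the efficiency problem. Adding the first
two equations in II, we obtain $-\theta^{+} - \theta^{-} = 0$. Since
$\theta^{+}, \theta^{-} \ge 0$ we must have $\theta^{+} = \theta^{-} = 0$.
Thus II reduces to

\[
\begin{aligned}
\min\ & -\delta e^{T}s^{+} - \delta e^{T}s^{-} \\
\text{s.t.}\ & \hat{Y}\lambda - s^{+} = \hat{Y}_0 \\
& -\hat{X}\lambda - s^{-} = -\hat{X}_0 \\
& e^{T}\lambda = 1 \\
& \lambda,\ s^{+},\ s^{-} \ge 0
\end{aligned}
\tag{4}
\]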


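Thereby we have $\hat{Y}_0$ and $\hat{X}_0$ enveloped by convex combinations of
the $\hat{Y}_j$, $\hat{X}_j$. With optimal solutions $\lambda^*$, $s^{*+}$,
$s^{*-}$, we can write

\[
Y_0 = \prod_{j=1}^{n} Y_j^{\lambda_j^*}\, e^{-s^{*+}}, \qquad
X_0 = \prod_{j=1}^{n} X_j^{\lambda_j^*}\, e^{s^{*-}},
\tag{5}
\]

where $\sum_j \lambda_j^* = 1$ and by $Y_j^{\lambda_j}$, resp.
$X_j^{\lambda_j}$, we mean
$(Y_{1j}^{\lambda_j}, \ldots, Y_{sj}^{\lambda_j})^{T}$, resp.
$(X_{1j}^{\lambda_j}, \ldots, X_{mj}^{\lambda_j})^{T}$, with the products and
exponentials taken componentwise.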

Thus our optimal envelopments are by Cobb-Douglas functions with
$\lambda_j^* > 0$ implying that DMU_j is efficient, i.e., DMU_0 is associated
with the efficiency surface "facet" spanned by those DMU_j's for which
$\lambda_j^* > 0$.
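
To make the envelopment in (5) concrete, the following sketch builds a point on a facet as a componentwise weighted geometric mean of two efficient units, applies log-scale slacks, and then confirms that the result satisfies the log-linear equality constraints of (4). All numbers, the two-unit facet, and the function names are hypothetical and chosen only for illustration.

```python
import math

def cobb_douglas_point(units, lam):
    """Componentwise weighted geometric mean: prod_j units[j][k] ** lam[j]."""
    assert abs(sum(lam) - 1.0) < 1e-12 and all(w >= 0 for w in lam)
    size = len(units[0])
    return [math.prod(u[k] ** w for u, w in zip(units, lam)) for k in range(size)]

# Hypothetical facet spanned by two efficient DMUs (2 outputs, 2 inputs).
Y_facet = [[100.0, 40.0], [60.0, 90.0]]
X_facet = [[20.0, 8.0], [25.0, 5.0]]
lam = [0.25, 0.75]
s_plus = [0.0, 0.10]    # output slacks, in log units
s_minus = [0.05, 0.0]   # input slacks, in log units

# Equation (5): the evaluated unit implied by the facet point and the slacks.
Y0 = [v * math.exp(-s) for v, s in zip(cobb_douglas_point(Y_facet, lam), s_plus)]
X0 = [v * math.exp(s) for v, s in zip(cobb_douglas_point(X_facet, lam), s_minus)]

def log_residuals(obs, units, lam, slack, sign):
    """Residual of one block of equalities in (4); zero when the block holds."""
    return [
        sum(w * math.log(u[k]) for u, w in zip(units, lam))
        + sign * slack[k] - math.log(obs[k])
        for k in range(len(obs))
    ]

res_out = log_residuals(Y0, Y_facet, lam, s_plus, -1.0)   # log Y * lam - s+ = log Y0
res_in = log_residuals(X0, X_facet, lam, s_minus, +1.0)   # log X * lam + s- = log X0
print(max(abs(r) for r in res_out + res_in) < 1e-9)
```

A convex combination in the logarithms is the same thing as a Cobb-Douglas (weighted geometric) combination in the original data, which is the sense in which the frontier is piecewise Cobb-Douglas.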

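We note further that the simplified dual programs corresponding to (3) are
now

(I')
\[
\begin{aligned}
\max\ & \mu^{T}\hat{Y}_0 - \nu^{T}\hat{X}_0 + \omega \\
\text{s.t.}\ & \mu^{T}\hat{Y} - \nu^{T}\hat{X} + \omega e^{T} \le 0 \\
& -\mu^{T} \le -\delta e^{T} \\
& -\nu^{T} \le -\delta e^{T}
\end{aligned}
\]

(II')
\[
\begin{aligned}
\min\ & -\delta e^{T}s^{+} - \delta e^{T}s^{-} \\
\text{s.t.}\ & \hat{Y}\lambda - s^{+} = \hat{Y}_0 \\
& -\hat{X}\lambda - s^{-} = -\hat{X}_0 \\
& e^{T}\lambda = 1 \\
& \lambda,\ s^{+},\ s^{-} \ge 0
\end{aligned}
\tag{6}
\]
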

These results present us with a new method for estimating piecewise
Cobb-Douglas production functions directly from empirical data. The form of
(II'), in contrast to that of [4], is also sufficiently simple that one can
anticipate that the mathematical statistics of this type of Cobb-Douglas
estimation may well be developed in the near future (see also the Appendix
in [3]).




REFERENCES 


[1]  Charnes, A., Cooper, W. W., Seiford, L., and Stutz, J., "A Multiplicative
     Model for Efficiency Analysis," Socio-Economic Planning Sciences
     (forthcoming 1983); also CCS 416, Center for Cybernetic Studies, The
     University of Texas, Austin, Texas, November 1981.

[2]  Charnes, A., Cooper, W. W., and Rhodes, E., "Measuring the Efficiency of
     Decisionmaking Units," European Journal of Operational Research, v. 2,
     pp. 429-444, 1978.

[3]  Charnes, A. and Cooper, W. W., "Management Science Relations for
     Evaluation and Management Accountability," Journal of Enterprise
     Management, v. 2, pp. 143-162, 1980.

[4]  Banker, R. D., Charnes, A., Cooper, W. W., and Schinnar, A. P., "A
     Bi-Extremal Principle for Frontier Estimation and Efficiency
     Evaluations," Management Science, v. 27, No. 12, pp. 1370-1382,
     December 1981.

[5]  Charnes, A., Cooper, W. W., and Schinnar, A. P., "A Theorem on
     Homogeneous Functions and Extended Cobb-Douglas Forms," Proc. Natl.
     Acad. Sci., U.S.A., 73, No. 10, pp. 3747-3748, October 1976.




IV.  A COMPARATIVE STUDY OF DATA ENVELOPMENT ANALYSIS AND OTHER APPROACHES
TO EFFICIENCY EVALUATION AND ESTIMATION

1.  Introduction

Data Envelopment Analysis (DEA) is a new efficiency measurement
methodology developed by A. Charnes, W. W. Cooper, and E. Rhodes as set
forth in [12], [13] and [14]. It is designed to measure the relative
efficiency of Decision Making Units (DMUs) which use multiple inputs to
produce multiple outputs even when the underlying production function is
not known and where, additionally, these functions may also be multiple
in character. This contrasts with the situation for statistical
techniques and theory, e.g., as employed in economics, where either the
underlying production function must be known, or at least its parametric
form must be assumed before it can be used to evaluate efficiencies and
where, usually, a single functional form is also assumed. See, e.g.,
Feldstein [18]. See also [32] and [33]. The latter, regression approaches,
are thus limited, especially in the case of public sector institutions such
as hospitals, etc., where programs and activities are even less readily
identified for such assumptions than is the case in industrial production.

DEA  has  now  been  applied  to  several  types  of  organizations  including 
education  [5]  [6],  health  care  [4]  [29],  Navy  recruiting  [22],  and  criminal 
court  systems  [21].  Nevertheless  something  more  is  required  and,  in 
particular,  the  validity  and  reliability  of  DEA  in  locating  inefficient 
DMUs,  identifying  the  inputs  (and/or  outputs)  where  the  inefficiencies 
occur  and  estimating  their  amounts  or  magnitudes  all  need  to  be  evaluated. 

One way to approach this task is via a situation in which the identity of
the truly inefficient units is known along with the sources and amounts of
this inefficiency. This paper therefore attempts to evaluate DEA through use
of an artificial data base where the efficient and inefficient DMUs are all
known in numerical detail. DEA's performance is then compared with other
commonly employed techniques such as ratio and regression analyses.


Regression and ratio analyses were selected for these evaluations
because they are widely used in fields like health services, which is the
field we shall use to guide our data base construction. In this paper
we restrict our examination only to some of the fairly simple forms of ratio
and/or regression approaches that are in wide use.1/ More sophisticated
regression techniques such as the translog function and other so-called
"flexible functional form" approaches are considered elsewhere. See
Sherman [29].

The following section describes how the data base was constructed and
section 3 discusses the data base that was developed. Section 4 describes
the  version  of  DEA  that  will  be  used  while  sections  5,  6  and  7  discuss  the 
results  of  applying  DEA,  ratio  and  regression  analyses  to  this  data  base.  The 
resulting  comparisons  are  summarized  in  section  7  with  respect  to  the 
ability  of  these  techniques  to  identify  and  distinguish  between  efficient 
and  inefficient  DMUs.  Section  8  then  extends  the  uses  of  DEA  to  locating 
and  estimating  the  amounts  of  inefficiencies  in  particular  DMUs  in  ways 
that  are  not  generally  available  when  the  ratio  or  regression  approaches 
are  used.  A  concluding  section  then  discusses  some  of  the  shortcomings  found 
in  these  other  approaches  and  indicates  where  they  differ  from  DEA  and 
how  some  of  their  shortcomings  might  be  repaired. 

1/  Similarly, only one version of DEA is used and no attempt is made to
distinguish between various types of efficiencies such as scale vs.
technical efficiencies and other sources of inefficiency such as are
examined in [3]. Finally, we did not use statistical techniques to
develop our data base, as was done in [2], and hence can make only
limited use of statistical significance tests and like devices for
generalizing our results. Our purpose is to supply insight of potential
value on the use of the techniques we study rather than to secure
generalizations for the different data situations that might be
encountered in actual practice.

2.  Model Structure and Data Generation

The artificial data set was constructed by defining a hypothetically
"known" technology which applies to all Decision Making Units (DMUs) and
defines efficient input-output relationships for each of them.1/
Inefficiencies which were explicitly introduced for certain DMUs take the
form of excess inputs used for the output levels attained. Hence, a DMU
that achieves its output level by using only the amount of inputs required
by this hypothetical technology is efficient while a DMU that uses more
than the required amount of any input is inefficient. To make the inputs
and outputs easier to recognize, they are referred to and labelled in the
context of a hospital study as one area of potential interest. See
Sherman [29]. We assume that these hospitals are all public (not-for-profit)
institutions so that the usual profit calculus and/or price-weighted
reductions to a scalar measure of efficiency evaluation are not wholly
appropriate.


1/  Knowledge gained from the study of Massachusetts hospitals reported in
[29] was used in the choice of inputs and outputs and in the construction
of the data set.

The set of artificial hospital data generated for our simulation consisted
of three outputs produced with three inputs during a one year period of time1/
as follows:

Outputs:

y1:  Regular patient* care/year
     (patients treated in one year with average level of inputs
     for treatment)

y2:  Severe patient* care/year
     (patients treated in one year with severe illness requiring
     higher input levels than regular patients for more complex
     treatment)

y3:  Teaching of residents and interns/year
     (number of individuals receiving one year of training)

Inputs:

x1:  Staff utilized in terms of full-time equivalents, i.e., (FTEs)/year

x2:  Number of hospital bed days available/year

x3:  Supplies in terms of dollar cost/year

*measured in terms of number of patients treated

The data set to be generated was for 15 hypothetical hospitals which we
label as H1, H2, ..., H15, to represent the pertinent DMUs.2/ They are all
assumed to achieve their outputs via a common production process, which
they may use efficiently or inefficiently. The resulting observed values
are then constructed in a manner that we shall shortly describe.

In this study we shall focus on input inefficiencies, by which we
mean that one or more of the above inputs may be used in excess to obtain
a particular hospital's output values. Although we could also similarly
study output deficiencies (in the form of output shortfalls from given
inputs),3/ we shall not lengthen the paper to undertake that study here.
In any case the known values of the per unit inputs for efficient
production are given in Exhibit 1 inserted at the end of this paper.

1/  I.e., we are considering all data as annual rates.

2/  Subdivisions may also be used such as, e.g., the surgical units within
each hospital that were studied in [29].

3/  An output shortfall approach from given inputs is used in [2].

The usual regression approach to efficiency and related types of
economic analyses in multiple output situations uses a single aggregate
function of a linear or logarithmic variety in which total cost is
regressed against the observed output values. See, e.g., [18]. This
approach carries with it a variety of assumptions1/ which we shall try to
favor in our construction by using the same prices and a common technology
for all DMUs. We shall not assume that all DMUs operate on their efficiency
frontiers, however, but we shall otherwise proceed in accordance with the
usual methods of estimation, testing and analyses that have been commonly
employed in regression studies of health services and related fields.

To make the sense of this discussion more precise, we present our
expressions for generating the inputs required for efficient operations
by any hospital in the following form:


\[
x_{ij} = \sum_{r=1}^{3} a_{irj}\, y_{rj} \tag{1}
\]

where

$x_{ij}$ = amount of input i used per year by hospital j

$y_{rj}$ = amount of output r produced per year by hospital j

$a_{irj}$ = amount of input i used per unit of output r by
hospital j during the year.

1/  See, e.g., Sato [27].

2/  A use of DEA to distinguish coefficients for input-output analyses
derived from data for efficient and inefficient sets of operations may
be found in Schinnar [28].




These $a_{irj}$ values, which are fixed constants, represent an efficient set
of coefficients which may be used to generate the inputs required for any
observed (or planned) level of outputs. In some cases we will assign
values $\hat{a}_{irj} > a_{irj}$ for some i, r and j to represent managerial
(= hospital) inefficiencies which yield values

\[
\hat{x}_{ij} = \sum_{r=1}^{3} \hat{a}_{irj}\, y_{rj} \tag{2}
\]

with $\hat{x}_{ij} > x_{ij}$ when inefficiencies are present.


The efficient values are given, free of any of the j = 1, ..., 15
hospital identification subscripts, in Exhibit 1. These values are the same
for all hospitals so that $a_{11}$ = .004 FTE/patient represents the efficient
labor requirement in Full Time Equivalent units per regular patient.
Similarly $a_{12}$ = .005 FTE/patient represents the efficient requirement for
a severe patient and $a_{13}$ = .03 FTE/training unit represents the efficient
requirement to train one new resident/intern during a year.

Analogous remarks apply to the values $a_{21}$ = 7 bed days/patient, and
$a_{22}$ = 9 bed days/patient for regular and severe patients, respectively,
shown in the Bed Days column of Exhibit 1. The blank shown in the row
for Training Units in this column means that $a_{23}$ = 0 applies. That is,
no Bed Days enter into the training outputs.

Finally, $a_{31}$ = $20/patient and $a_{32}$ = $30/patient represent the
efficient level of supplies required per regular and severe patients,
respectively, while $a_{33}$ = $500/training unit is the coefficient for
efficient training operations in output r = 3. Putting this i = 3 input
in dollar units avoids the detail that would otherwise be needed to
identify the different types of supplies that would be required for
teaching and for different types of patient treatments.


DEA does not require reductions to cost equivalents. The various
outputs and inputs may be specified in different units of measure and,
indeed, it can be shown that the resulting DEA efficiency value is
independent of the units of measure used in any output or input.1/ On
the other hand reductions like these are required for the ratio and
regression measures we shall also study. Therefore we next show how the
efficient costs are derived to obtain this part of our data set. This
is done via expressions of the form,

\[
c_{ir} = k_i\, a_{ir}, \qquad i, r = 1, 2, 3, \tag{3}
\]

where we have omitted the index j for hospital identification because only
efficient costs are being considered. Here $c_{ir}$ represents the cost of the
i-th input requirement for the r-th output under efficient operations where

\[
k_1 = \$10{,}000/\text{FTE}, \qquad
k_2 = \$10/\text{bed day}, \qquad
k_3 = \$1/\text{supply unit}. \tag{4}
\]

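These data are then combined with the preceding $a_{ir}$ values to
obtain

\[
\begin{aligned}
c_1 &= k_1 a_{11} + k_2 a_{21} + k_3 a_{31} = \$130/\text{regular patient} \\
c_2 &= k_1 a_{12} + k_2 a_{22} + k_3 a_{32} = \$170/\text{severe patient} \\
c_3 &= k_1 a_{13} + k_2 a_{23} + k_3 a_{33} = \$800/\text{training unit}.
\end{aligned}
\tag{5}
\]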
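
The arithmetic behind these unit costs can be reproduced directly from the coefficients and prices quoted in the text. The sketch below is only a check of that arithmetic; the variable and function names are illustrative. With the stated values ($a_{13}$ = .03, $a_{23}$ = 0, $a_{33}$ = 500), the training-unit cost works out to 300 + 0 + 500 = 800.

```python
# Efficient input coefficients a[i][r]: rows are inputs (FTEs, bed days,
# supply dollars); columns are outputs (regular, severe, training).
a = [
    [0.004, 0.005, 0.03],
    [7.0, 9.0, 0.0],
    [20.0, 30.0, 500.0],
]
# Prices k[i]: dollars per FTE, per bed day, and per supply unit.
k = [10_000.0, 10.0, 1.0]

def efficient_unit_costs(a, k):
    """c_r = sum_i k_i * a_ir, the efficient cost per unit of output r."""
    n_outputs = len(a[0])
    return [sum(k[i] * a[i][r] for i in range(len(k))) for r in range(n_outputs)]

costs = [round(c, 6) for c in efficient_unit_costs(a, k)]
print(costs)
```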

These  are  the  formulas  used  at  the  bottom  of  Exhibit  1  to  produce 

the  efficient  cost  of  outputs  shown  in  the  last  column  in  the  body  of  the 


1/  Provided, of course, that these same units of measure are used for the
specified output (or input) in the data for every DMU. See Charnes,
Cooper and Rhodes [10]. See also Rhodes [25] and Charnes and Cooper [7].

3.  Data Base Development


We  now  turn  to  Exhibit  2  which  reflects  the  composition  of 
inefficient  and  efficient  hospitals  included  in  our  data  base.  The 
hypothesized  "actual"  (or  observed)  inputs  per  unit  output  used  by  each 
hospital,  whether  efficient  or  not,  are  listed  in  Exhibit  2,  columns  9-16 
with  inefficient  input  levels  per  unit  of  output  denoted  by  CD- 
Column 17 reflects the actual vacancy rate (% of unused bed days available
during  the  year)  where,  as  noted  in  Exhibit  1,  an  efficient  hospital  is 
expected  to  have  a  5%  vacancy  rate. 

We develop the actual inputs used for each hospital in the manner we
have already described by first selecting an arbitrary set of output values
for each of the hospitals listed in the left-hand stub.1/ Teaching units
per year are reflected in column 6, regular patients treated during the
year are in column 7, and severe patients treated during the year are in
column 8.

Other ways of summarizing patient care outputs for later use are
included in columns 4 and 5. Column 4 reflects total patients as
the sum of column 7 and column 8. Column 5 reflects the percentage (%)
of severe patients treated which is based on (column 8) ÷ (column 4) × (100).
We develop this percentage output measure because it reflects output data
in a form which is often used to evaluate efficiency in many real data sets.2/

The inputs used by each hospital to produce the outputs in columns 6, 7,
and 8 are reflected in columns 1, 2, and 3. Column 1 contains the full time
equivalents (FTEs) of labor years used. Column 2 has the bed days/year
which were available and column 3 gives the supply dollars used during the year.

1/  Although these values could have been selected by statistical principles
(e.g., of an experimental design variety) there seemed to be little point in
doing so because our objective was to secure insight rather than the kinds of
generalizability that require statistical tests of significance. See [2],
however, for a study of the latter type.

2/  See  the  discussion  in  Sherman  [29]. 


The values in columns 1, 2, 3 reflect mixtures of efficient and
inefficient utilization of resources because of the way they were derived.
We can clarify this by means of Exhibit 3 which illustrates how the data
for H1, an efficient DMU, and H15, an inefficient DMU, were constructed.

H1 is efficient and therefore used the same inputs per unit outputs as
the structural model in Exhibit 1. During the year, H1 provided care for
3000 regular patients, 2000 severe patients, and 50 training units of
service. It therefore utilized (.004)(3000) + (.005)(2000) + (.03)(50) = 23.5
FTEs in that year. H15 produced the same outputs as H1 but was inefficient
in its use of certain inputs. It used .005 FTEs/regular patient, while it
adhered to the structural model FTE usage rates for severe patients
(.005 FTEs/patient) and training (.03 FTEs/training unit). H15 therefore
used (.005)(3000) + (.005)(2000) + (.03)(50) = 26.5 FTEs/year to produce
the same outputs. Similarly, H15 is inefficient in the number of bed days
used and supply dollars used per regular patient but is efficient in the
amount of bed days and supply dollars consumed for severe patients and for
supply dollars used for teaching outputs. Bed days and FTEs and supply
dollar inputs are also calculated in Exhibit 3 to further illustrate the
way the data base was constructed.
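
The FTE calculation for H1 and H15 can be written as a short computation. This is a sketch that covers only the staff input, using the rates and output levels quoted in the text; the dictionary keys and function name are illustrative, and bed days and supply dollars are left out because their inefficient rates for H15 appear only in Exhibit 2.

```python
# Efficient FTE requirement per unit of each output, as quoted from Exhibit 1.
EFFICIENT_FTE = {"regular": 0.004, "severe": 0.005, "training": 0.03}

def input_used(rates, outputs):
    """Total input = sum over outputs of (input per unit output) * (output level)."""
    return sum(rates[name] * level for name, level in outputs.items())

# H1 and H15 produce the same outputs.
outputs = {"regular": 3000, "severe": 2000, "training": 50}

fte_h1 = input_used(EFFICIENT_FTE, outputs)

# H15 uses .005 FTEs per regular patient instead of the efficient .004.
h15_rates = dict(EFFICIENT_FTE, regular=0.005)
fte_h15 = input_used(h15_rates, outputs)

# Excess staff attributable to the inefficient rate.
excess_fte = fte_h15 - fte_h1
print(round(fte_h1, 2), round(fte_h15, 2), round(excess_fte, 2))
```

The difference of 3 FTEs is the known "true" staff inefficiency for H15 against which an evaluation technique's estimate can be compared.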

The number of FTEs, bed days, and supply dollars inputs were calculated
as illustrated in Exhibit 3 for each hospital based on the arbitrarily
assigned output mix of regular patients, severe patients and training units
and the actual efficient or inefficient input per unit output rates reflected
in Exhibit 2.


Certain relationships posited in the structural model are generally
not known, like the actual amount of staff time and supplies that are
required to support each intern or resident at a hospital. We nevertheless
explicitly introduce these relationships to determine if the efficiency
measurement techniques we will apply can uncover them. Before proceeding,
however, it should perhaps be noted that when the underlying
structural model is known, which DMUs are inefficient can be determined
directly and techniques such as we will be considering would be
unnecessary for purposes of efficiency evaluation.

4.  The DEA Model

The Charnes Cooper Rhodes (CCR) model for data envelopment analysis
which we will use assumes the following form:

Objective:

\[
\max\ h_0 = \frac{\sum_{r=1}^{s} u_r\, y_{r0}}{\sum_{i=1}^{m} w_i\, x_{i0}}
\]

Constraints:

Less than unity constraints:

\[
1 \ge \frac{\sum_{r=1}^{s} u_r\, y_{rj}}{\sum_{i=1}^{m} w_i\, x_{ij}},
\qquad j = 1, \ldots, 15 \tag{6}
\]

Positivity constraints:

\[
0 < u_r, \quad r = 1, \ldots, s; \qquad 0 < w_i, \quad i = 1, \ldots, m
\]

Data:

Outputs:  $y_{rj}$ = observed amount of r-th output for j-th hospital
Inputs:   $x_{ij}$ = observed amount of i-th input for j-th hospital.


1/  Other models which might have been used can be found in [3] and [15].
See also [16].


This model is therefore in fractional programming form with fractional
constraints. As noted in Charnes, Cooper and Rhodes [13] it may be
replaced by an ordinary linear programming model that also has
non-Archimedean conditions imposed on the variables for what are here
referred to as positivity constraints.1/

We shall not enter into this kind of development but shall instead
try to explicate what is happening in our DEA analysis by means of the
above model. First we observe that the efficiency ratings are all
restricted to an upper limit of unity. One of the j = 1, ..., 15 hospitals,
when singled out for efficiency evaluation, is represented in the objective
as well as the constraints. By virtue of the latter condition we must
have max $h_0 = h_0^* \le 1$. Furthermore all observations $y_{rj}$ and
$x_{ij}$ are positive so that, together with the positivity imposed on the
variables, we will also have $0 < h_0^* \le 1$ with $h_0^* = 1$ when and only
when DMU_0, the DMU being evaluated, is efficient.
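
The way the constraints bound the rating can be seen most easily in the special case of one output and one input. This case is an illustrative simplification and not the three-output, three-input computation used in this study. There the constraints in (6) force $u/w \le \min_j x_j / y_j$, so the optimal rating is the evaluated unit's output-input ratio divided by the best observed ratio. The data and function name below are hypothetical.

```python
def ccr_single_ratio(y, x, o):
    """CCR rating h_o* for one output and one input.

    The constraints u*y_j / (w*x_j) <= 1 for all j give u/w <= min_j x_j/y_j,
    so the maximum of u*y_o / (w*x_o) is (y_o/x_o) / max_j (y_j/x_j).
    """
    best = max(yj / xj for yj, xj in zip(y, x))
    return (y[o] / x[o]) / best

# Hypothetical units: patients treated (output) and FTEs used (input).
patients = [5000.0, 5000.0, 4000.0]
ftes = [23.5, 26.5, 20.0]

ratings = [ccr_single_ratio(patients, ftes, o) for o in range(len(patients))]
print([round(h, 4) for h in ratings])
```

Each rating lies in (0, 1], and the unit with the best ratio receives exactly 1, which mirrors the bound $0 < h_0^* \le 1$ stated above. With multiple outputs and inputs no such closed form is available and the linear programming equivalent is solved instead.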

Qualifications need to be entered to allow for the presence of slack
in the corresponding linear programming model.2/ We will not treat this topic
in rigorous detail in the present paper but will instead supply an illustration
with accompanying discussion that will provide insight into what is
involved. Here we need only say that when slack is present in some input
then, with efficiency, that input may be reduced to a new input level by

1/  See Charnes, Cooper, Lewin, Morey and Rousseau [11] for a precise
development.

2/  Any slack which occurs in (6) is simply the complement of an efficiency
rating but the development in [11] provides a way of identifying the
presence of non-Archimedean values in (6) with slack in the corresponding
linear programming model.

removing the slack without affecting any output or any other input. Hence
the input which involved this slack was excessive and the operation could
not have been efficient.

Bearing this in mind we next initiate our DEA analysis by reference
to the data of Exhibit 2 after which we shall attempt to compare the
resulting efficiency ratings with cost ratio and regression approaches
applied to this same data base.

5.  Applications  to  Artificial  Data  Base. 

Applying (6) to Exhibit 2 with each of H1, ..., H15 inserted in the
objective produces the h* values reported in Table 1. Every one of the
efficient DMU's has received a rating of h* = 1, but two inefficient DMU's,
H10 and H13, are also accorded a value of h* = 1. The six DMU's that are
rated as inefficient, with h* < 1, are accorded these values by comparison
with certain efficient units that comprise an efficiency reference set for
the inefficient DMU (see Table 1). For example, H8 was found to be
inefficient by direct comparison with H4; and H15 is being compared
directly with H4, H6, and H7. This reference set, we need only note here,
is supplied as part of the optimum basis in the linear programming
computations. Hence the model and computing routines supply what is wanted
without extra effort and, furthermore, the appearance of a DMU as part of
an optimal basis ensures that it is efficient so that separate computations
need not be made for these entities if that is all that is wanted.

1/  Computer codes are available for effecting these computations. See [6].
New software by I. Ali and J. Stutz, which details the efficient facets
observed, is also available from the Center for Cybernetic Studies at The
University of Texas at Austin.


It might be observed that the two inefficient DMU's that were
accorded efficiency values of h* = 1 have no such reference sets. This
suggests that they have special properties which can be submitted to
further analysis by means of the non-Archimedean formulations that we
touched on earlier in the text.1/ We shall not turn aside to deal with
that topic. Instead we shall simply accept this identification of H10
and H13 as a possible weakness of DEA in the comparisons we are making
with other techniques since (as in this case) it can happen.


1/ Note also that neither H10 nor H13 enters into the reference set for
any other DMU.


                                Table 1

                        DEA Efficiency      Efficiency
Efficient DMU's           Rating (E)        Reference Set

   H1                       1.0
   H2                       1.0
   H3                       1.0
   H4                       1.0
   H5                       1.0
   H6                       1.0
   H7                       1.0

                        DEA Efficiency      Efficiency
Inefficient DMU's         Rating (E)        Reference Set

   H8                       0.99            H4
   H9                       0.98            H1, H2, H6
   H10                      1.0
   H11                      0.85            H4, H7
   H12                      0.99            H1, H4, H6
   H13                      1.0
   H14                      0.99            H1, H4, H6
   H15                      0.87            H4, H6, H7

We now consider how a manager, e.g., in a rate setting commission for
some state,1/ might determine which DMUs are more and less efficient when
using ratios, a widely used form of analysis to evaluate financial and
operating performance. In this example, all the inputs are jointly used by
these DMUs to produce three outputs so that we cannot proceed as we might
in the single output case. A number of different ratios might be developed
to evaluate different sets of relationships such as FTEs/patient,
FTEs/severe patient, FTEs/regular patient, FTEs/teaching output,
bed days/patient, bed days/severe patient, etc. Such a set of ratios does
not explicitly recognize the joint use of these inputs to produce these
various outputs. In addition, for the set of ratios calculated, a DMU may
be among the highest (least efficient) for certain ratios and lowest (most
efficient) for other ratios. This leads to some ambiguity as to whether
that DMU is efficient or inefficient and calls for some method of weighting
or ordering the importance of the ratios to gain some overall assessment of
efficiency such as was generated using DEA in Table 1.

Rather than address this issue directly, we will focus on a type of
unit costing ratio analysis that is often applied to hospitals and
other organizations to evaluate DMU performance. By design we can say
that all 15 hospitals (DMUs) paid the same price per unit for each type
of input and thus ignore possible difficulties which arise for a ratio
analysis when this is not the case. That is, we can combine the inputs into
dollar units without the confounding effect of differing input costs. Rather
than deal with all these outputs, the teaching output might be viewed as a
by-product or secondary output and the patients might be viewed as a single
output rather than being segregated into different categories of severity.
This simplifying procedure is not wholly defensible from a cost accounting
standpoint. Nevertheless, in the absence of any other way of combining and
weighting the outputs, similar approaches have been used for hospitals as
well as other types of DMUs (see for example [23]), and this is the way we
shall proceed.

1/ For instance, see [23] and [24].

Table 2 column (A) reflects the average cost per patient for each DMU.
This results in a ranking of hospitals reflected by the parenthesized number
directly to the right of the average cost figure in Table 2. The lowest
cost (most efficient) DMU is ranked 1 and the highest cost (least efficient)
DMU is ranked 13. This ranking erroneously classifies H13 (rank 6) as more
efficient than H3 (rank 7) and H6 (rank 9), and it classifies H9 as more
efficient than H6. In addition, there is no objective means for determining
the cutoff cost level to segregate efficient and inefficient units.

If the efficient relative costs of certain outputs are known, the outputs
can be weighted to reflect a cost per weighted unit of output. In this case we
know the efficient cost of a regular patient ($130) and a severe patient
($170) and the patient units can therefore be weighted to value each severe
patient as the equivalent of 170/130, or approximately 1.3, regular patients.
For example, H1 would have adjusted patient output units of 3000 regular
patients + 2000 x 1.3 severe patients for an adjusted total of 5600 patients.
Dividing this patient total into $775,500, the total cost for H1 shown in
Exhibit 3, results in $138.48, the case mix adjusted average cost shown for
H1 in column (B) of Table 2.

                                   Table 2

                            Single Output Measures

                                     Case Mix         Case Mix Adjusted Average Cost
                                     Adjusted         per Patient Segregated into High
                  Average Cost       Average Cost     and Low Levels of Teaching Outputs
Hospital          per Patient        per Patient         Low*             High*
                      (A)                (B)              (C)               (D)

Efficient Units
  H1             $155.10   (2)      $138.48   (4)     $138.48  (2)
  H2              163.32   (5)       138.40   (3)      138.40  (1)
  H3              168.32   (7)       142.65   (8)                       $142.65  (3)
  H4              160.10   (4)       142.94   (9)      142.94  (5)
  H5              158.38   (3)       137.73   (2)                        137.73  (2)
  H6              170.15   (9)       140.12   (5)      140.12  (3)
  H7              142.60   (1)       135.81   (1)                        135.81  (1)

Inefficient Units
  H8              176.95  (11)       157.99  (12)**                      157.99  (6)
  H9              168.32   (7)       142.64   (7)      142.64  (5)
  H10             169.69   (8)       161.61  (14)**                      161.61  (7)
  H11             170.33  (10)       153.10  (10)      153.10  (7)
  H12             178.33  (12)       155.07  (11)                        155.07  (5)
  H13             165.68   (6)       142.00   (6)      142.00  (4)
  H14             178.33  (12)       155.07  (11)                        155.07  (5)
  H15             179.74  (13)       160.48  (13)**    160.48  (8)

  Mean            167.02             146.94            144.77            149.42
  Standard
  Deviation                            8.82              7.36              9.66

*  Low teaching outputs were 50 units and high teaching outputs were 100
   units as per Exhibit 3, Col. 6.

** Hospitals more than one standard deviation over average cost.

The adjusted cost per patient is reflected in column (B) of Table 2
with the new ranking in parentheses immediately to the right of the
adjusted average cost. Even with this (normally not available) weighting
of patients we continue to have a misranking, with inefficient DMUs H9
and H13 being ranked as more efficient than H3 and H4. If we further
segregate the 15 DMUs by the third output (teaching), as is sometimes
done, and separate them based on those with high (100 units) versus low
(50 units) teaching outputs, the ranking based on unit costs is reflected
in columns C and D in Table 2. At this point, we have achieved an
accurate ranking for the high teaching output hospitals but we still have
not achieved an accurate ranking for the low ones. Because we have only two
values for these outputs, at 50 and 100 "teaching units," we could distinguish
high vs. low output hospitals fairly easily in the present case, but generally
there will be many more values to consider with no objective guidance available
for separating high from low teaching output values and the difficulty of
distinguishing efficient from inefficient DMUs will then be compounded.
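
The case mix adjustment behind column (B) of Table 2 can be sketched in a
few lines of Python. This is only an illustration built on the figures quoted
in the text (efficient unit costs of $130 and $170, and H1's 3,000 regular
patients, 2,000 severe patients and $775,500 total cost). The function and
variable names are hypothetical, and the severe-patient weight is rounded to
1.3 as in the text rather than carried at full precision.

```python
# Efficient unit costs quoted in the text (dollars per patient).
EFFICIENT_COST_REGULAR = 130.0
EFFICIENT_COST_SEVERE = 170.0


def case_mix_adjusted_cost(total_cost, regular, severe):
    """Average cost per patient after weighting severe patients.

    The weight is rounded to one decimal (170/130 -> 1.3) to follow
    the rounding used in the text; this is an assumption of the sketch.
    """
    weight = round(EFFICIENT_COST_SEVERE / EFFICIENT_COST_REGULAR, 1)
    adjusted_patients = regular + weight * severe
    return total_cost / adjusted_patients


# H1: 3,000 regular and 2,000 severe patients, total cost $775,500.
h1_unadjusted = 775_500 / (3000 + 2000)                    # column (A)
h1_adjusted = case_mix_adjusted_cost(775_500, 3000, 2000)  # column (B)
print(f"H1 average cost per patient:       ${h1_unadjusted:.2f}")
print(f"H1 case mix adjusted average cost: ${h1_adjusted:.2f}")
```

Carrying the weight at full precision (170/130 = 1.3077) would give a slightly
lower adjusted cost, so the rounding choice matters when reproducing the
tabulated figures.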

The problem of locating a point beyond which DMUs are considered
inefficient is typically addressed by establishing a subjective cutoff
value, even though there is no assurance, theoretical or otherwise, that
the inefficient units will be accurately located through this process.
For example, if the cutoff was set at one standard deviation above the
mean adjusted cost per patient, only 3 DMUs (H8, H10 and H15) would be
identified as inefficient, as indicated in column (B) of Table 2.1/
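
The cutoff rule just described is easy to state in code. The sketch below
applies it to the case mix adjusted costs of Table 2, column (B). The helper
name `flagged_inefficient` is hypothetical, and the use of the population
standard deviation is an assumption, chosen because it reproduces the mean
(146.94) and standard deviation (8.82) reported in Table 2.

```python
from statistics import mean, pstdev

# Case mix adjusted average cost per patient, Table 2 column (B).
ADJUSTED_COST = {
    "H1": 138.48, "H2": 138.40, "H3": 142.65, "H4": 142.94, "H5": 137.73,
    "H6": 140.12, "H7": 135.81, "H8": 157.99, "H9": 142.64, "H10": 161.61,
    "H11": 153.10, "H12": 155.07, "H13": 142.00, "H14": 155.07, "H15": 160.48,
}


def flagged_inefficient(costs, k=1.0):
    """Return DMUs whose cost exceeds mean + k * (population) std. dev."""
    values = list(costs.values())
    cutoff = mean(values) + k * pstdev(values)
    return [dmu for dmu, cost in costs.items() if cost > cutoff]


# One standard deviation above the mean.
print(flagged_inefficient(ADJUSTED_COST, k=1.0))
# The looser 0.6745 standard deviation cutoff.
print(flagged_inefficient(ADJUSTED_COST, k=0.6745))
```

Neither cutoff flags H9 or H13, which illustrates the point that a subjective
threshold gives no assurance of locating the inefficient units.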

The DEA ratings in Table 1 do not lend themselves to rankings of
the kind used in Table 2. As will be seen below, these efficiency measures
are intended to supply estimates of excessive resource utilization relative
to the Efficiency Reference Sets from which these ratings are derived. If,
on the other hand, one uses the estimated resource savings as a basis and
accords the same ranks to DMUs with equal efficiency ratings, a more
informative set of ranks would be available from Table 1 than Table 2.2/
Whether ranked or not, however, Table 1 is more informative than Table 2
provided, of course, that the efficiency values exhibited in Table 1 are
reasonably accurate.

1/ At 0.6745σ = 5.95, three more DMUs (H11, H12 and H14) would be added to
this inefficient set. We record this as an additional possibility for
improving this kind of identification even though most of the commonly
used adjustments are in the direction of kσ, with k > 1.

2/ [...] also need to impute dollar magnitudes or other [...]tial savings.

7. Regression Analysis

In industries, including the "health industry," where the efficient
input-output technology is not known with any real precision, regression
analysis has been applied in order to gain "insights" into the production
relationships that might underlie the observations that have been generated
from past utilization of these processes. There are, of course, a variety
of problems that are encountered when using traditional regression
analyses to evaluate the efficiency of individual DMUs. One problem in
most such studies is that one relatively smooth relation is posited
to obtain the parameter estimates that are needed. Another problem is
that the estimated parameter values are based on least squares estimates which
provide "mean" or "central tendency" values that reflect a mixture of
efficient and inefficient behavior in the data set.1/ Thus, even if the
posited functional forms are correct, the estimated regressions will only
reflect efficient relationships if all units in the study are themselves
efficient. Whatever reasons may be used to justify such assumptions in
competitive industries, they are likely to be much weaker in not-for-profit
settings such as education, health, and government.

Nevertheless such approaches have been extensively employed and so
we now consider the extent to which regression analysis as it has been
used, e.g., in health studies, might be employed to identify the
inefficient units in the artificial data set. In the process we shall
also locate other potential problems in the use of such analyses even
when we can validly make the advantageous assumptions that
all DMUs have the same technology and pay the same prices for all inputs.

One part of our analysis involves a simple linear (additive) regression
model in which total cost was estimated as a function of the three outputs
produced by each DMU. The results were as follows:

\[
C = -95{,}300 + \underset{(8)}{152}\,y_1 + \underset{(22.2)}{182.4}\,y_2 + \underset{(767)}{1302}\,y_3 \tag{7}
\]

where C   = Total cost per year
      y_1 = # of regular patients treated per year
      y_2 = # of severe patients treated per year
      y_3 = Training units provided in one year

1/ Recent literature has begun to supply a variety of means for addressing
some of these problems when regression estimates for securing efficiency
evaluations are wanted. They do not appear to be very satisfactory, however,
and so we do not examine them here. See Banker, Charnes, Cooper and
Maindiratta [2]. We confine ourselves only to those types of regressions
which have been commonly (and widely) employed. See, e.g., [34].


The standard errors noted in the parentheses below each coefficient
indicate high levels of statistical significance. The coefficient signs
are positive, as required, and the relation between the y1 and y2 (for
regular and severe patient) coefficients is in the correct (plausible)
direction. A high R^2 value of 0.97 suggests a good fit with the
observational data so, by standard reasoning, a high degree of cost
variation is "explained" by these independent variables.1/

The only apparent discrepancy is a fixed negative cost estimate of
$95,300. This value, which is not statistically significant, might cause
the model to be questioned especially in cases involving hospitals with
relatively small outputs. Hence another regression with its total cost
intercept fixed at zero was calculated. We do not reproduce the results
here, however, since (consistent with what has just been said) the
resulting coefficient values did not differ greatly from those given in (7).
Hence the latter might be used to estimate the incremental cost per unit
of each output as in the second column of the following tabulation:


             Estimated        Efficient
             Incremental      Incremental      %
Output       Cost             Cost             Deviation

y1           $  152.          $130              17.0
y2           $  182.40        $170               7.3
y3           $ 1302.          $500             160.0

1/ The independent variables were found to have fairly low
inter-correlations as follows:
r(y1,y2) = -0.37;  r(y1,y3) = -0.03;  r(y2,y3) = -0.08.

Focusing on the incremental costs in this manner bypasses the
difficulties associated with a negative intercept value. It also
corresponds to an assumption (not often stated explicitly) that the slope
coefficients may still parallel the true incremental efficiency values,
at least roughly, in a manner that corresponds to a shift of the regression
plane up to the frontier without altering its slopes.1/ In the present
case, we know the incremental costs for efficient operations and these are
supplied in the third column. The estimates from the regression are high
in every case. Only the estimate for y2 (= severe patients) is even
tolerable and the estimated cost for y3 (= teaching) is very wide of the mark.
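
The percentage deviations in the tabulation follow directly from the two
cost columns. The short Python sketch below recomputes them. The dictionary
and function names are hypothetical, and the results agree with the tabulated
figures only to within the rounding used there.

```python
# Incremental cost per unit of output: regression estimate from (7)
# versus the known efficient value (dollars).
INCREMENTAL_COST = {
    "y1 (regular patients)": (152.0, 130.0),
    "y2 (severe patients)": (182.40, 170.0),
    "y3 (teaching units)": (1302.0, 500.0),
}


def percent_deviation(estimated, efficient):
    """Overestimate of the efficient cost, as a percentage of it."""
    return 100.0 * (estimated - efficient) / efficient


deviations = {
    output: percent_deviation(est, eff)
    for output, (est, eff) in INCREMENTAL_COST.items()
}
for output, dev in deviations.items():
    print(f"{output}: {dev:.1f}%")
```

Every deviation is positive, which is the sense in which the regression
estimates are "high in every case."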

Another use of such regressions is to evaluate efficiencies as was done
by Feldstein [18] in his now classic study of British hospitals. That is,
the actually observed outputs for each of H1 to H15 would be inserted in an
expression like (7) and the resulting total cost would then be compared
with the corresponding actual costs at this hospital.2/ The presence of a
negative intercept value could be troublesome, however, and alternate
forms of regression functions might then be explored.
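
As a rough illustration of this kind of evaluation, the sketch below inserts
one hospital's outputs into regression (7) and compares the estimate with
actual cost. The outputs are those used for H15 in Table 4 (3,000 regular
patients, 2,000 severe patients, 50 teaching units). The actual total cost is
not quoted directly in the text; here it is backed out from Table 2, column
(A), as $179.74 per patient times 5,000 patients, which is an assumption of
this sketch. All function names are hypothetical.

```python
# Coefficients of the linear regression (7).
INTERCEPT = -95_300.0
COEF_REGULAR, COEF_SEVERE, COEF_TEACHING = 152.0, 182.4, 1302.0


def estimated_cost(regular, severe, teaching):
    """Total cost predicted by regression (7) for a given output mix."""
    return (INTERCEPT + COEF_REGULAR * regular
            + COEF_SEVERE * severe + COEF_TEACHING * teaching)


def looks_inefficient(actual_cost, regular, severe, teaching):
    """Flag a DMU whose actual cost exceeds its regression estimate."""
    return actual_cost > estimated_cost(regular, severe, teaching)


# H15 outputs as used in Table 4; total cost backed out from Table 2
# column (A) (an assumption, not a figure quoted in the text).
h15_actual = 179.74 * 5000
h15_estimate = estimated_cost(3000, 2000, 50)
print(f"estimated ${h15_estimate:,.0f}, actual ${h15_actual:,.0f}")
print("flagged as inefficient:", looks_inefficient(h15_actual, 3000, 2000, 50))
```

Because the regression reflects average rather than efficient behavior, a
comparison of this kind measures distance from the fitted plane, not from the
efficient frontier.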


1/ This method of parallel-shift treatment is explicitly incorporated in
some of the "frontier estimation" methods that have recently been devised.
See Forsund, Lovell and Schmidt [19].

2/ A variety of adjustments might be employed to allow for different hospital
characteristics and patient mixes, etc. See Feldstein [18] for further
discussion.

Another type of function that has been commonly employed in hospital
studies is the so-called Cobb-Douglas form. This form has the advantage
of avoiding the possibility of negative intercepts and since, in the
present data set, no zero outputs are present for any of the hospitals, we
can also avoid difficulties that are sometimes experienced from this quarter.
Thus we now turn to such a Cobb-Douglas approach.

In logarithmic form our estimated relation obtained from the data of
Exhibit 3 is

\[
\ln C = 3.98 + \underset{(.04)}{.62}\,\ln y_1 + \underset{(.07)}{.57}\,\ln y_2 + \underset{(.05)}{.10}\,\ln y_3 \tag{8}
\]

which, in the usual Cobb-Douglas representation, becomes

\[
C = 53.79\; y_1^{0.62}\, y_2^{0.57}\, y_3^{0.10} \tag{9}
\]

In this case the coefficients in (8) and hence the exponents in (9) all
appear to be reasonable as well as significant. In sum, however, the
exponent values (.62 + .57 + .10) exceed 1 which, being significant,
means that evidence of decreasing returns to scale is present, or at
least this possibility cannot be rejected. In our case this may reflect
the complementary and substitution relations that are known to be present
in some of the inputs.1/ The regression does not detect these relations in
this form, however, and the fact that it results in a significant value
(with R^2 = 0.96) could lead to erroneous recommendations with respect to
decisions on the scale of operations.
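
The returns-to-scale reading of (9) can be checked numerically. In the sketch
below the constant and exponents are those of (9), while the function name and
the sample output levels (3,000, 2,000 and 50) are chosen only for
illustration. For a cost function of this form, scaling every output by a
factor t scales estimated cost by t raised to the sum of the exponents.

```python
# Estimated Cobb-Douglas cost function (9).
SCALE_CONSTANT = 53.79
EXPONENTS = {"y1": 0.62, "y2": 0.57, "y3": 0.10}


def cobb_douglas_cost(y1, y2, y3):
    """Total cost implied by (9) for outputs y1, y2, y3 (all > 0)."""
    return (SCALE_CONSTANT * y1 ** EXPONENTS["y1"]
            * y2 ** EXPONENTS["y2"] * y3 ** EXPONENTS["y3"])


# Sum of the exponents: the percentage rise in estimated cost that goes
# with a one percent rise in every output.
scale_elasticity = sum(EXPONENTS.values())

# Doubling every output multiplies estimated cost by 2 ** scale_elasticity.
base = cobb_douglas_cost(3000, 2000, 50)
doubled = cobb_douglas_cost(6000, 4000, 100)
print(f"sum of exponents: {scale_elasticity:.2f}")
print(f"cost ratio when all outputs double: {doubled / base:.3f}")
```

A ratio above 2 means cost more than doubles when outputs double, which is
the decreasing-returns reading discussed in the text.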


1/ E.g., as reflected in A^{-1} x = y when going from x = Ay, with A a matrix
of positive constants as in (1). Thus, in general, A^{-1} will have
negative as well as positive elements reflecting relations of
complementarity as well as substitution among the various inputs used
in producing these output combinations. See Sherman [29] for further
discussion.

If we now consider DMUs as potentially inefficient when their actual
total cost exceeds the estimated total cost in (9), then efficient DMUs
H2, H6, and H7 would be erroneously considered inefficient and inefficient
DMUs H11, H12, H13, and H14 would be identified as efficient. These results
together with the results of our preceding analysis are drawn together and
presented in Table 3. In identifying which DMUs are efficient or inefficient,
DEA has evidently done better than the others with the exception of the cost
ratio approach when the latter is (a) adjusted for case mix and/or (b)
identified with "low" and "high" levels of teaching outputs. There is, of
course, a degree of arbitrariness present in these cost ratio efficiency
and inefficiency characterizations that provide these favorable results for
comparison with DEA. Furthermore the Case Mix adjustment procedure we
used presupposes a knowledge of the efficient cost of operations and this
is reflected in the results shown in both columns (B) and (C) in Table 3.
Normally these costs will not be known and so we may count the apparently
favorable results of these ratio analyses as proceeding from an assumed
knowledge that will generally not be available. This knowledge is not
required by DEA and hence we may regard it as being superior to the ratio
analysis in these respects as well as in other respects that we shall begin
to examine after first summarizing some of our other findings to this point
as follows:


1.  Ratio (cost) analysis and regression analysis required an
    arbitrary rule to determine which DMUs would be designated as
    inefficient. With ratio analysis, the mean might well have been
    lower or higher depending on whether there were more or fewer
    efficient units in the data set. Similarly, regression analysis
    might also have a lower or higher cost curve depending on the
    relative number of inefficient units.

2.  Ratio analysis, as did regression analysis, required price data
    and other adjustments to address the multiple output and input
    situation while DEA could address this situation directly. In
    addition, the ratios would be confounded if DMUs paid different
    prices for similar inputs. For example, a DMU that had very low
    prices might have a lower average cost that could obscure the
    presence of technical (production) inefficiencies. Regression
    analysis also assumed DMUs had the same costs/input, and different
    unit input costs would have shifted the cost function and could
    thereby also conceal inefficiencies.

3.  Regression analysis results depended on the selection of an
    appropriate model or set of cost relationships and nothing in the
    data set suggested that either of the choices was not appropriate.
    DEA, however, required no such assumptions.


There are other points that can also be made as we move beyond mere
classification into identifying the particular inputs where inefficiencies
occur and estimating their amounts. This will be dealt with in the sections
that follow.

                                Table 3

  Comparison of DEA, ratio analysis, and linear regression approaches'
                  ability to locate inefficient DMU's

                      E = DMU rated as efficient
                      I = DMU rated as inefficient

                      (A)        (B)            (C)               (D)
                                                Case Mix
                      DEA (1)    Ratio          Adjusted          Regression (4)
                      Results    Analysis (2)   Cost/Patient (3)  (Cobb-Douglas)

Efficient DMU's       [...]

Inefficient DMU's

   H8                  I          I              I                 I
   H9                  I          E              E                 I
   H10                 E          I              I                 I
   H11                 I          E              I                 E
   H12                 I          E              I                 E
   H13                 E          E              E                 E
   H14                 I          E              I                 E
   H15                 I          I              I                 I

(1) From Table 1.

(2) From Table 2 column B - DMUs with cost/patient greater than one standard
    deviation above the mean used to identify inefficient DMUs.

(3) From Table 2 columns C and D with cost/patient greater than one standard
    deviation above the mean used to identify inefficient DMUs.

(4) Based on rule that DMUs with actual total cost greater than estimated
    total cost (based on the regression model) are inefficient.
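
A simple tally makes the comparison in Table 3 concrete. The sketch below
counts, for each approach, how many of the eight inefficient DMUs it rates as
inefficient. The ratings are transcribed from the rows of Table 3 for the
inefficient DMUs only, so false alarms among the efficient DMUs are not
counted; the short method labels and the variable names are hypothetical.

```python
# Ratings of the eight inefficient DMUs as read from Table 3
# ("I" = rated inefficient, "E" = rated efficient), one character per
# column in the order listed in METHODS.
METHODS = ["DEA", "Ratio (B)", "Case mix (C)", "Regression (D)"]
TABLE_3 = {
    "H8":  "IIII",
    "H9":  "IEEI",
    "H10": "EIII",
    "H11": "IEIE",
    "H12": "IEIE",
    "H13": "EEEE",
    "H14": "IEIE",
    "H15": "IIII",
}

# Number of truly inefficient DMUs that each method rates as "I".
detected = {
    method: sum(ratings[col] == "I" for ratings in TABLE_3.values())
    for col, method in enumerate(METHODS)
}
for method, count in detected.items():
    print(f"{method}: {count} of {len(TABLE_3)} inefficient DMUs located")
```

The tally says nothing about which units are missed: DEA misses H10 and H13,
whereas the case mix adjusted ratio misses H9 and H13 and relies on efficient
cost information that would not normally be available.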

8.  Extensions 


Perhaps the easiest approach to the topic of identifying the sources
and estimating the amounts of inefficiency present in each DMU is to
begin with a specific example. We therefore begin with H15 as an
illustration of these kinds of additional uses of DEA. This hospital, which
is inefficient, has already been discussed in association with H1 in
Exhibit 3. We now approach it in a different manner as follows.

First consider the value of h0* = 0.87 in Table 1. Here we shall use
this value to obtain the results shown in the column labelled "Intensity
Adjusted Value" in Table 4. Because slack values also need to be
considered in assessing efficiency we may refer to these h0* values as
"intensity factors" and use them in the manner of the h0* = 0.87 value
that is applied to each of the inputs in Table 4. The value which is then
obtained in the case of H15 can then be compared with the corresponding
value shown under the column labelled "True Efficiency Value". The latter
are the values of the inputs actually needed for the outputs of H15 with
efficient operations, as obtained from the efficient coefficient values
provided in Exhibit 1. The maximum discrepancy of $(139,200 - 130,000) =
$9,200 or, approximately, 7% occurs in the case of Supply $. The other DEA
estimates resulting from the intensity adjustment factor applied to the
observed inputs are within 2% and 0.3%, respectively, of the true
efficiency values.

                                TABLE 4

             H15 INTENSITY ADJUSTMENT AND EFFICIENCY VALUE

Adjusted Input Values

              Observed      Intensity       Intensity
              Input         Adjustment      Adjusted
              Value         Factor          Value

FTE:          26.5      x   0.87        =   23.055
Bed Days:     47,370    x   0.87        =   41,211.9
Supply $:     160,000   x   0.87        =   139,200

Efficient Input Values

              Adjustments                                       True
                                               Teach            Efficiency
              Regular        Severe            Units            Value

FTE:          .004 x 3,000 + .005 x 2,000    + .03 x 50      =  23.5
Bed Days:     (7 x 3,000   + 9 x 2,000) / 0.95*              =  41,052
Supply $:     20 x 3,000   + 30 x 2,000      + 200 x 50      =  130,000

*0.95 = vacancy factor for efficient production. See assumption
(a) in Exhibit 1.

Evidently our h* value has operational significance in that it
indicates "amounts" of inefficiency that are present. It thus differs
from the index numbers and like approaches that are sometimes used for
efficiency ratings. See, e.g., the index constructed by Feldstein [18]
for use in the case of British hospitals.

As indicated earlier, the presence of slack in an optimal tableau is
also to be considered a source of inefficiency, and these data, too,
are available from the simplex tableaus. In particular, the slack value
for Supplies in the optimal solution amounts to $11,880 and 955 Bed Days
of slack are also present. When these amounts are subtracted from the
Intensity Adjusted Values in rows 3 and 2 of Table 4, new estimates for
efficient inputs in these factors become $127,313 and 40,257 BD,
respectively. This greatly improves the efficiency estimate of the former
along with some worsening of the latter. All estimates are now within about
2% of the true efficiency value.

It is not contended that DEA efficiency estimates will always be this
close and, indeed, reference to Table 5 will show estimates that are very
wide of the mark for H10 in at least 2 of the 3 pertinent input categories.
On the other hand, even in this case the estimates are both better and more
detailed than those obtained from the ratio and regression approaches
discussed earlier in this article. Also, as was observed in our discussion
of Table 1, there are strong reasons to suspect the h0* = 1 intensity
values for H10 and H13. Elimination of these two hospitals still leaves
H11 with errors in the range of 10-15% for three of the input estimates,
while all of the other errors are in a range of about 2% or less.
Furthermore this record is considerably improved when the efficient
hospitals, H1 to H7, are added to the list since in their case the
estimates all have zero errors.
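
The H15 calculations of Table 4, together with the slack adjustment described
above, can be reproduced with the following Python sketch. The input values,
coefficients, h* = 0.87 and slack amounts are those quoted in the text and in
Table 4. The variable and function names are hypothetical, and the zero slack
for FTE is an assumption (the text reports slack only for Supplies and Bed
Days).

```python
H_STAR = 0.87  # DEA rating of H15 from Table 1

observed = {"FTE": 26.5, "Bed Days": 47_370, "Supply $": 160_000}
# Slack from the optimal solution; FTE slack of zero is assumed.
slack = {"FTE": 0.0, "Bed Days": 955, "Supply $": 11_880}

# Efficient input requirements for 3,000 regular patients, 2,000 severe
# patients and 50 teaching units, using the coefficients shown in Table 4
# (0.95 is the vacancy factor applied to bed days).
true_efficient = {
    "FTE": 0.004 * 3000 + 0.005 * 2000 + 0.03 * 50,
    "Bed Days": (7 * 3000 + 9 * 2000) / 0.95,
    "Supply $": 20 * 3000 + 30 * 2000 + 200 * 50,
}

intensity_adjusted = {k: H_STAR * v for k, v in observed.items()}
slack_adjusted = {k: intensity_adjusted[k] - slack[k] for k in observed}


def pct_error(estimate, truth):
    """Absolute error of an estimate as a percentage of the true value."""
    return 100.0 * abs(estimate - truth) / truth


for k in observed:
    before = pct_error(intensity_adjusted[k], true_efficient[k])
    after = pct_error(slack_adjusted[k], true_efficient[k])
    print(f"{k}: {before:.1f}% error before slack, {after:.1f}% after")
```

With h* rounded to 0.87 the slack-adjusted Supply figure comes out at $127,320
rather than the $127,313 quoted in the text, which presumably reflects the
unrounded rating.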

This seems to be a very creditable performance, at least compared to
what the other approaches appear to offer for use on the data base we
have erected. Further testing will also be required both on other data bases
and in actual uses, of course, and improvements in the methodology and
alternate modeling approaches and estimation methods will also need to be
explored.

Methods by which such testing might be done will be discussed in the
next section. We can then conclude this section by noting that still other
uses of DEA are also possible. For instance, what we have been doing in
this section amounts to projecting each DMU onto the relevant position of
the efficiency surface in conformance with the methods prescribed in [13].
Further tradeoffs may then be effected by reference to the marginal rates
of transformation and/or substitution via the optimal u_r* and v_i* values
which may be secured from the simplex tableaus. See (6). These values can
provide guidance for augmenting or contracting the inputs and outputs of
the corresponding DMU and, at the same time, provide controls and guidance
on efficient uses by the managers of these DMUs.

These u_r* and v_i* values will represent estimates which, of course, may
not be wholly accurate. The same is true of the similar uses of regression
estimates but, in addition, such regression estimates can be expected to be
very wide of the efficiency values, as should be clear from our earlier
discussions. Indeed, as noted in [2], the estimates of such substitution
and transformation rates generally continue to be very far from the true
efficiency values even when the simple forms of regression functions used
in the present article are replaced by more general and flexible forms and
when the statistical methods used are specifically directed toward frontier
efficiency estimates.


                                 TABLE 5

                                INTENSITY                ESTIMATED    TRUE
HOSPITAL            OBSERVED    ADJUSTED                 EFFIC.       EFFIC.
INPUTS              VALUE       VALUE        SLACK       VALUE        VALUE

H8     FTE             25.0        24.75        --          24.75
       BD            49,475       48,980     8,425         40,555
       $S           140,000      138,600        --        138,600     140,000

H9     FTE             24.5        24.01        --          24.01        24.5
       BD            43,160       42,297        --         42,297      43,158
       $S           165,000      161,700    25,000        136,700     140,000

H10    FTE             77.0        77.0         --          77.0         53.0
       BD            92,630       92,630        --         92,630      92,632
       $S           340,000      340,000        --        340,000     280,000

H11    FTE             44.5        37.8        5.1          32.7
       BD            65,260       55,471        --         55,471
       $S           265,000      225,250    45,711        179,539

H12    FTE                                                               30.0
       BD                                                              50,526
       $S                                                             170,000

H13    FTE             43.5        43.5         --          43.5
       BD            81,110       81,110        --         81,110
       $S           245,000      245,000        --        245,000

H14    FTE             30.0        29.7         --          29.7         30.0
       BD            60,000       59,400     9,476         49,924      50,526
       $S           170,000      168,300        --        168,300     170,000

H15    $S           160,000      139,200

9.  Conclusion 


The  really  surprising  result  is  not  how  well  DEA  performed  on  our 
manufactured  data  base,  but  rather  the  poor  performance  of  the  econometric- 
statistical  models  we  employed.  These  models  are  representative  of  many 
analyses  that  have  been  employed  in  studies  used  to  draw  important  policy 
conclusions.  Two  recent  multi-million  dollar  studies  of  this  kind  that 


resulted  in  multi-volume  reports  with  important  findings  for  policy 

formation  are:  (1)  U.S.  Department  of  Health,  Education  and  Welfare, 

PSRO:  An  Initial  Evaluation  of  the  Professional  Standards  Review 

Organization  [in  Health  Care  Delivery]-^  and  (2)  U.S.  Office  of  Edu  ation. 

The  Follow  Through  Planned  Variation  Experiment  [for  Education  of 

2  / 

Disadvantaged  Children] . — 

The  questions  raised  by  our  across-DMU  regression  results  would  seem 
to  apply  a  fortiori  to  studies  like  these  since  in  our  case  the  design  of 
the data base was favorable to assumptions such as a common technology and
a  common  price  structure  across  the  DMUs.  Assumptions  like  these  are  much 
less  likely  to  be  valid  for  regressions  used  in  applied  studies,  such  as  the 
kinds  we  just  cited. 


It might be argued that it is unfair to level criticisms such as these
at regression models designed to handle only one dependent variable at a
time and using methods of estimation directed toward average rather than
efficient behavior. 3/ In the study [2], which we conducted with R. Banker and
A. Maindiratta, however, both of these qualifications were accommodated.

1/  See  [32].  See  also  [17]  for  further  discussion  and  suggestions  for 
alternative  approaches. 

2/ See [33]. See also [12] for further discussion and suggested alternative
approaches. 

3/  Note,  however,  the  study  by  Feldstein  [is]  which  was  conducted  in  just 
this  manner  and  numerous  other  studies  of  this  type  can  also  be  cited. 

See  also  the  study  by  Banker,  Conrad  and  Strauss  [4]  which  consisted  of 
a  DEA  redo  of  a  previously  conducted  econometric  study  of  North  Carolina 
hospitals  (using  a  translog  function)  and  arrived  at  drastically  different 
conclusions  on  the  presence  of  returns  to  scale,  etc.,  which  had  been 
found  not  to  be  present  in  the  original  (econometric)  study. 
In that study, conducted in the same spirit as the one we are presently
summarizing, a piecewise Cobb-Douglas function with one output as the dependent
variable was used to represent a continuous technology with increasing and
decreasing returns to scale in its various segments. Technical as well as
scale inefficiencies were then introduced into randomly generated observations
as a basis for comparing DEA with so-called flexible functional form
approaches using translog regressions. DEA again performed very well but,
perhaps even more importantly, the statistical-econometric approaches
performed poorly, not only relative to DEA but also in a manner that was
unsatisfactory per se, in both technical and scale efficiency identification
and estimation. Moreover, the estimation methods employed for the regressions
in this case were of the so-called "corrected least squares" varieties, which
are specifically designed for the purpose of locating and estimating efficiency
frontiers. See [26] and [19].
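
To make the term concrete, one common variety of corrected least squares can be sketched as follows: fit an ordinary least-squares line in logarithms, then shift the intercept upward by the largest residual so that the fitted function bounds every observation from above. The Python below is a minimal sketch under that assumption, with a single input, a log-linear (Cobb-Douglas) form, and made-up data. It is not the estimation code of the study cited, and the function names are hypothetical.

```python
import math


def ols_line(u, v):
    """Ordinary least squares for v = a + b*u; returns (a, b)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    b = sxy / sxx
    a = mean_v - b * mean_u
    return a, b


def corrected_ols_frontier(x, y):
    """Shift the OLS intercept up by the largest residual (log scale)."""
    u = [math.log(xi) for xi in x]
    v = [math.log(yi) for yi in y]
    a, b = ols_line(u, v)
    residuals = [vi - (a + b * ui) for ui, vi in zip(u, v)]
    shift = max(residuals)
    a_frontier = a + shift
    # Efficiency of each observation: observed output / frontier output.
    eff = [math.exp(r - shift) for r in residuals]
    return a_frontier, b, eff


# Illustrative data only (one input x, one output y).
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.9, 3.1, 3.6, 5.2, 5.0]
a_frontier, b, eff = corrected_ols_frontier(x, y)
print("frontier: ln y = %.3f + %.3f ln x" % (a_frontier, b))
print("efficiencies:", [round(e, 3) for e in eff])
```

Note that the slope is still the one fitted to average behavior; only the intercept is moved, so exactly one observation is rated fully efficient in this sketch.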

One possible source of trouble, we think, lies not merely in the
estimation methods but rather in an approach (the one that is commonly taught
and employed) which tries to capture a great variety of behaviors in only
relatively smooth and simple (e.g., unconstrained) functional forms.
Attempts to meet these difficulties by weighted regressions, outlier analyses
and similar approaches do not really deal with the problem in a sufficiently
fundamental way, we think, and other alternatives need to be considered.

The  optimizations  involved  in  these  DEA  and  statistical  approaches 
also need to be considered. Generally speaking, the commonly employed
statistical approaches optimize over all observations while DEA optimizes
relative to each. Another way of stating this is to note that a complete
DEA analysis will, in general, involve n optimizations, one for each
observation, while the usual statistical approach involves only one.
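
The contrast can be illustrated in the simplest possible setting. The Python sketch below assumes one input, one output, and constant returns to scale. In that case the input-oriented DEA program for a given DMU o (minimize theta subject to sum_j lambda_j x_j <= theta x_o, sum_j lambda_j y_j >= y_o, lambda_j >= 0) has the closed-form solution theta_o = (y_o / x_o) / max_j (y_j / x_j). The data, the function names, and the through-the-origin regression used for comparison are illustrative assumptions, not part of the study.

```python
def dea_efficiency(x, y, o):
    """Solve the single-input, single-output program for DMU o.

    With one input and one output the linear program reduces to comparing
    DMU o's output/input ratio with the best ratio observed.
    """
    best_ratio = max(yj / xj for xj, yj in zip(x, y))
    return (y[o] / x[o]) / best_ratio


def dea_all(x, y):
    """One optimization per DMU: n separate efficiency evaluations."""
    return [dea_efficiency(x, y, o) for o in range(len(x))]


def regression_through_origin(x, y):
    """A single least-squares fit over all observations: y = b * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Illustrative data only: input x and output y for five DMUs.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [8.0, 20.0, 24.0, 28.0, 45.0]

theta = dea_all(x, y)                # n = 5 optimizations
b = regression_through_origin(x, y)  # 1 optimization

for j, t in enumerate(theta):
    print(f"DMU {j + 1}: DEA efficiency = {t:.3f}, "
          f"regression residual = {y[j] - b * x[j]:+.2f}")
```

Each DEA score is measured against the best observed practice, whereas the regression residuals are measured against a line describing average behavior, so the two need not rank the DMUs in the same order.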

This implies that differences in testing for results and checking
for possible inferences must also be expected. Because it is directed
toward individual observations, DEA is also directed to each DMU in a
way which suggests this as a fundamental unit of test. That is, the
inferences that are made about at least some of these DMUs can and should
be tested by on-site observations in ways, and with results, that differ
from testing statistical estimates for general types of class properties
effected across all observations.

Having identified these differences and their possible separate
avenues of application, testing and research, we can probably best close
on a somewhat different note by indicating ways in which the two approaches
might be joined together. One possibility is to use each approach,
regression or ratio analysis and DEA, to check on or fortify the other. 1/
Other possibilities exist, however, which might briefly be sketched as follows.

Aigner and Chu, in [1], essayed a new approach to frontier estimation
by means of what would now be called "goal programming" 2/ with only
one-sided deviations permitted so that, in general, the estimated production
function (e.g., a Cobb-Douglas form) would lie on or above all of the
observed output values. Confining all deviations to one side clearly does
not exhaust the possibilities, however, and one may go on to prescribing
proportions of the total deviations or even deviations for individual
observations that must lie on one side or the other of an estimated frontier.
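
The one-sided formulation can be sketched for a log-linear (Cobb-Douglas) form with a single input: choose a and b to minimize the sum of deviations sum_i (a + b ln x_i - ln y_i) subject to a + b ln x_i >= ln y_i for every observation. With only two unknowns, an optimal solution of this linear program can be found at a vertex, that is, at a line through two data points, so the Python sketch below simply enumerates such lines. The data and function names are illustrative assumptions, and the enumeration stands in for a general linear-programming solver.

```python
import math
from itertools import combinations


def one_sided_frontier(x, y, tol=1e-9):
    """Fit ln y <= a + b ln x with the smallest total one-sided deviation."""
    u = [math.log(xi) for xi in x]
    v = [math.log(yi) for yi in y]
    best = None
    for i, j in combinations(range(len(u)), 2):
        if abs(u[i] - u[j]) < tol:
            continue  # no unique line through points with equal inputs
        b = (v[i] - v[j]) / (u[i] - u[j])
        a = v[i] - b * u[i]
        deviations = [a + b * uk - vk for uk, vk in zip(u, v)]
        if min(deviations) < -tol:
            continue  # line falls below some observation: infeasible
        total = sum(deviations)
        if best is None or total < best[0]:
            best = (total, a, b)
    if best is None:
        raise ValueError("need at least two observations with distinct inputs")
    return best[1], best[2]


# Illustrative data only.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.9, 3.1, 3.6, 5.2, 5.0]
a, b = one_sided_frontier(x, y)
for xi, yi in zip(x, y):
    frontier = math.exp(a) * xi ** b
    print(f"x={xi:4.1f}  y={yi:4.1f}  frontier={frontier:5.2f}  "
          f"efficiency={yi / frontier:.3f}")
```

Relaxing the constraint for a prescribed share of the observations, as the text goes on to suggest, would amount to allowing some of the deviations in this sketch to be negative.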

In  a  similar  spirit,  C.  P.  Timmer  in  [30]  used  "chance  constrained 
programming"  formulations  and  concepts  to  effect  efficiency  estimates. 

1/ See [12] for further discussion on different conditions which might lead
to one approach or the other in complementary fashion for policy guidance
purposes.

2/ This was originally referred to as "inequality constrained regressions."
See [10] and [8]. Although not available at the time of the Aigner-Chu
work [1], we would now add the further possibilities that are available
from the goal interval programming approaches described in [9].


Instead of utilizing the power of chance constrained programming, e.g. to
deal with different proportions and even different probability distributions,
constraint by constraint, Timmer proceeded in an entirely different direction
and, in the spirit of a "global" statistical analysis, discarded "outlier"
observations one after another until he achieved what he regarded as "stable"
estimates. Notice, however, that this procedure is one which obliterates a
great deal of information. In particular, in pursuit of one global (overall)
property, 1/ it discards efficient DMUs without even bothering to investigate
them individually.

The  approaches  by  Aigner  and  Chu  [1]  and  by  Timmer  [30]  that  we  have 
just  described  involve  a  use  of  inequality  constrained  optimizations,  to 
be  sure,  but  they  otherwise  proceeded  in  the  spirit  of  classical  statistical 
approaches.  Something  more  may  also  be  accomplished  along  these  latter 
lines.  For  instance,  one  might  use  a  discriminant-function  or  cluster- 
analytic  approach  to  locate  subsets  of  the  original  points  which  have 
different  properties.  Hopefully  this  could  include  clusters  or  discriminant 
subsets of efficient and inefficient points. Separate regressions fitted
to  these  subsets  might  then  yield  improved  ways  of  identifying  inefficiencies 
and  estimating  their  amounts. 

We have not investigated the latter types of topics here, though we shall do
so in future papers, for the simple reason that we sought to adhere as closely
as possible to the kinds of approaches that have generally been used in the
kinds of studies we have been considering. Notice that a use of the
discriminant and/or cluster analysis approaches we have just described
involves an estimation of more than one regression relation and more than one

1/ This is contrary to the spirit of individual observation
investigation that we urged above, and for which the kind of stability
analysis provided in [11] is now available.
optimization. The other approaches of the goal programming and chance
constrained programming varieties, as in Aigner and Chu [1] and Timmer [30],
involve inequality constrained relations of a kind that are similar to the
ones used in DEA. Thus, we conclude that there are additional avenues of
possible relations between DEA and these other approaches that also invite
exploration.